Single-commit clean baseline after security scrub of niche-tells, project codenames, internal jargon, and contributor-email leaks. Contents: - 100 Rust crates (_primitives/_rust/) - 37 agent manifests (_manifests/) + generated specs (_generated/) - 67 user-invocable skills (skills/) - 33 hooks (hooks/) - Composition blocks (_blocks/) - Documentation (docs/, README.md) - TS adapter packages (_ts_packages/) - Assembler (_assembler/) - Roles (_roles/) - Templates (_templates/) - Forgejo CI (.forgejo/) Author: Denis Parfionovich <info@greendragon.info> License: see LICENSE.
178 lines
5.7 KiB
Markdown
178 lines
5.7 KiB
Markdown
# Duplication Detection Rules
|
|
|
|
## Threshold Matrix
|
|
|
|
| Duplication Type | Threshold | Action |
|
|
|-----------------|-----------|--------|
|
|
| Exact copy (>10 lines) | 2 instances | Extract immediately |
|
|
| Similar structure (>70% match, >15 lines) | 3 instances | Extract to shared function |
|
|
| Similar structure (>70% match, 5-15 lines) | 4+ instances | Consider extraction |
|
|
| Config/boilerplate patterns | 5+ instances | Generate/template |
|
|
| 3 lines or fewer | Any count | Leave as is (3 lines < abstraction cost) |
|
|
|
|
## Detection Methods
|
|
|
|
### Method 1: Structural Grep
|
|
```bash
|
|
# Find duplicate function bodies (exact)
|
|
grep -rn "function_pattern" src/ | sort | uniq -d
|
|
|
|
# Find similar patterns with context
|
|
grep -rn "pattern" src/ --include="*.ts" -A 5
|
|
```
|
|
|
|
### Method 2: AST-Level Analysis
|
|
Look for these structural duplicates:
|
|
- Same `if/else` chain in multiple files
|
|
- Same error handling block repeated
|
|
- Same validation logic in different endpoints
|
|
- Same DB query with minor variations
|
|
- Same transformation pipeline
|
|
|
|
### Method 3: Git-Based Detection
|
|
```bash
|
|
# Files that always change together (shotgun surgery)
|
|
git log --name-only --pretty=format: | sort | uniq -c | sort -rn
|
|
|
|
# Similar commit patterns
|
|
git log --all --oneline | grep -i "fix.*same\|duplicate\|copy"
|
|
```
|
|
|
|
## Extraction Decision Tree
|
|
|
|
```
|
|
Found duplicate code?
|
|
│
|
|
├── Is it 3 lines or fewer?
|
|
│ └── YES → Leave it. Abstraction costs more than duplication.
|
|
│
|
|
├── Is it pure data/config?
|
|
│ └── YES → Extract to constants/config file
|
|
│
|
|
├── Is it identical across 2+ places?
|
|
│ ├── Same module → Extract to private helper
|
|
│ ├── Same layer → Extract to shared utility in that layer
|
|
│ └── Cross-layer → Extract to shared lib/package
|
|
│
|
|
├── Is it similar but not identical?
|
|
│ ├── Differs by 1-2 params → Parameterize (function args)
|
|
│ ├── Differs by behavior → Strategy pattern
|
|
│ ├── Differs by type → Generics / template
|
|
│ └── Differs significantly → Maybe NOT duplication (coincidence)
|
|
│
|
|
└── Is it boilerplate forced by framework?
|
|
└── YES → Code generation, templates, or decorators
|
|
```
|
|
|
|
## What IS and ISN'T Duplication
|
|
|
|
### IS Duplication (extract):
|
|
- Same validation logic in multiple API endpoints
|
|
- Same error formatting in multiple catch blocks
|
|
- Same DB query pattern (find by X, include Y, map to Z)
|
|
- Same auth check repeated across routes
|
|
- Same DTO transformation in multiple services
|
|
|
|
### IS NOT Duplication (leave alone):
|
|
- Similar but domain-different logic (calculating tax vs calculating discount)
|
|
- Test setup code (each test should be self-contained)
|
|
- Interface implementations that look similar but serve different contracts
|
|
- 2-3 lines of glue code (creating instance, calling method, returning result)
|
|
- Logging statements (same format, different context)
|
|
|
|
## Refactoring Patterns for Duplication
|
|
|
|
### Pattern 1: Extract Function
|
|
```
|
|
BEFORE:
|
|
// in file A
|
|
const user = await db.user.findUnique({ where: { id }, include: { profile: true } });
|
|
if (!user) throw new NotFoundException('User not found');
|
|
|
|
// in file B (identical)
|
|
const user = await db.user.findUnique({ where: { id }, include: { profile: true } });
|
|
if (!user) throw new NotFoundException('User not found');
|
|
|
|
AFTER:
|
|
// shared/users.ts
|
|
async function findUserOrThrow(id: string) {
|
|
const user = await db.user.findUnique({ where: { id }, include: { profile: true } });
|
|
if (!user) throw new NotFoundException('User not found');
|
|
return user;
|
|
}
|
|
```
|
|
|
|
### Pattern 2: Parameterize
|
|
```
|
|
BEFORE:
|
|
function getActiveUsers() { return db.user.findMany({ where: { status: 'active' } }); }
|
|
function getPendingUsers() { return db.user.findMany({ where: { status: 'pending' } }); }
|
|
|
|
AFTER:
|
|
function getUsersByStatus(status: UserStatus) {
|
|
return db.user.findMany({ where: { status } });
|
|
}
|
|
```
|
|
|
|
### Pattern 3: Strategy/Callback
|
|
```
|
|
BEFORE:
|
|
function processCSV(data) { parse(data); validate(data); saveToS3(data); }
|
|
function processJSON(data) { parse(data); validate(data); saveToDB(data); }
|
|
|
|
AFTER:
|
|
function processData(data, parser, saver) {
|
|
const parsed = parser(data);
|
|
validate(parsed);
|
|
saver(parsed);
|
|
}
|
|
```
|
|
|
|
### Pattern 4: Template Method
|
|
```
|
|
BEFORE:
|
|
class EmailNotifier { format() {...} send() { format(); deliver(); log(); } }
|
|
class SMSNotifier { format() {...} send() { format(); deliver(); log(); } }
|
|
|
|
AFTER:
|
|
abstract class Notifier {
|
|
abstract format(): string;
|
|
abstract deliver(): void;
|
|
send() { this.format(); this.deliver(); this.log(); }
|
|
}
|
|
```
|
|
|
|
## Cross-Module Duplication Rules
|
|
|
|
### Where to Put Shared Code
|
|
|
|
```
|
|
project/
|
|
├── src/
|
|
│ ├── shared/ # Cross-module utilities
|
|
│ │ ├── utils/ # Pure functions (no deps)
|
|
│ │ ├── types/ # Shared type definitions
|
|
│ │ └── constants/ # Shared constants
|
|
│ ├── modules/
|
|
│ │ ├── users/ # Module-specific code
|
|
│ │ └── orders/ # Module-specific code
|
|
│ └── lib/ # Framework-specific shared code
|
|
│ ├── db.ts # Database client
|
|
│ ├── redis.ts # Redis client
|
|
│ └── logger.ts # Logger config
|
|
```
|
|
|
|
### Monorepo Duplication
|
|
```
|
|
packages/
|
|
├── shared/ # Shared across all packages
|
|
│ ├── types/
|
|
│ ├── utils/
|
|
│ └── constants/
|
|
├── web/ # Uses shared/
|
|
├── api/ # Uses shared/
|
|
└── worker/ # Uses shared/
|
|
```
|
|
|
|
**Rule**: If 2+ packages duplicate the same logic, move to `shared/`.
|
|
**Exception**: Keep package-specific if the shared version would need too many conditionals.
|