KeiSeiKit-1.0/skills/architecture-rules/references/duplication.md
Author: Denis Parfionovich <info@greendragon.info>

License: see LICENSE.
2026-05-01 12:09:03 +08:00


# Duplication Detection Rules
## Threshold Matrix
| Duplication Type | Threshold | Action |
|-----------------|-----------|--------|
| Exact copy (>10 lines) | 2 instances | Extract immediately |
| Similar structure (>70% match, >15 lines) | 3 instances | Extract to shared function |
| Similar structure (>70% match, 5-15 lines) | 4+ instances | Consider extraction |
| Config/boilerplate patterns | 5+ instances | Generate/template |
| 3 lines or fewer | Any count | Leave as is (an abstraction costs more than 3 duplicated lines) |
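The matrix can be read as a small lookup function. The sketch below is one possible encoding, not part of the kit: `Finding`, `Action`, and `recommendAction` are hypothetical names, the `below-threshold` result is an assumption for combinations the matrix does not list, and short exact copies are assumed to fall through to the "similar structure" rows.

```typescript
// Hypothetical types; the numeric thresholds are copied from the matrix above.
type DuplicationKind = "exact" | "similar" | "boilerplate";
type Action =
  | "leave-as-is"
  | "extract-immediately"
  | "extract-to-shared-function"
  | "consider-extraction"
  | "generate-or-template"
  | "below-threshold"; // assumption: the matrix is silent, so do nothing yet

interface Finding {
  kind: DuplicationKind; // "similar" assumes the caller already measured a >70% match
  lines: number;         // length of the duplicated fragment
  instances: number;     // number of occurrences
}

function recommendAction(f: Finding): Action {
  if (f.lines <= 3) return "leave-as-is";
  if (f.kind === "boilerplate") {
    return f.instances >= 5 ? "generate-or-template" : "below-threshold";
  }
  if (f.kind === "exact" && f.lines > 10 && f.instances >= 2) {
    return "extract-immediately";
  }
  // Assumption: an exact copy also counts as a >70% structural match,
  // so shorter exact copies are checked against the "similar" rows.
  if (f.lines > 15 && f.instances >= 3) return "extract-to-shared-function";
  if (f.lines >= 5 && f.lines <= 15 && f.instances >= 4) return "consider-extraction";
  return "below-threshold";
}
```

For example, `recommendAction({ kind: "similar", lines: 10, instances: 3 })` yields `"below-threshold"`, because the 5-15 line row only applies from four instances.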
## Detection Methods
### Method 1: Structural Grep
```bash
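# List matching lines that occur more than once (exact text match).
# -h drops file names so identical lines from different files compare equal;
# with -n, every line carries a unique file:line prefix and uniq -d finds nothing.
grep -rh "function_pattern" src/ | sort | uniq -d
# Show each match with 5 lines of trailing context for manual comparison
grep -rn "pattern" src/ --include="*.ts" -A 5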
```
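Grep only compares single lines. To find multi-line exact copies, one option is to slide a fixed-size window over each file and group identical windows. The sketch below assumes sources are already loaded into memory; `SourceText`, `Occurrence`, and `findExactDuplicates` are hypothetical names, and the default window of 11 lines mirrors the ">10 lines" row of the threshold matrix.

```typescript
interface SourceText { path: string; source: string; }
interface Occurrence { path: string; line: number; } // 1-based start line

function findExactDuplicates(
  files: SourceText[],
  windowSize: number = 11,
): Map<string, Occurrence[]> {
  const seen = new Map<string, Occurrence[]>();
  for (const file of files) {
    // Trim each line so indentation differences do not hide a copy.
    const lines = file.source.split("\n").map((l) => l.trim());
    for (let i = 0; i + windowSize <= lines.length; i++) {
      const window = lines.slice(i, i + windowSize);
      if (window.every((l) => l === "")) continue; // ignore all-blank windows
      const key = window.join("\n");
      const occurrences = seen.get(key) ?? [];
      occurrences.push({ path: file.path, line: i + 1 });
      seen.set(key, occurrences);
    }
  }
  // Keep only windows that occur more than once.
  const duplicates = new Map<string, Occurrence[]>();
  seen.forEach((occurrences, key) => {
    if (occurrences.length > 1) duplicates.set(key, occurrences);
  });
  return duplicates;
}
```

Overlapping windows report the same long copy several times; merging adjacent hits is left out to keep the sketch short.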
### Method 2: AST-Level Analysis
Look for these structural duplicates:
- Same `if/else` chain in multiple files
- Same error handling block repeated
- Same validation logic in different endpoints
- Same DB query with minor variations
- Same transformation pipeline
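A real AST comparison needs a parser, but a rough approximation is to replace identifiers and literals with placeholders and compare what is left. The sketch below is a crude regex-based stand-in, not an AST tool: `structuralFingerprint` and `KEYWORDS` are hypothetical names, the keyword list is deliberately incomplete, and comments that contain quote characters or multi-line template literals will confuse it.

```typescript
// Keywords are kept so that control flow stays visible in the fingerprint.
const KEYWORDS = new Set([
  "if", "else", "for", "while", "return", "throw", "new", "await", "async",
  "function", "const", "let", "var", "try", "catch", "class",
]);

function structuralFingerprint(code: string): string {
  return code
    .replace(/(["'`])(?:\\.|(?!\1).)*\1/g, "STR") // string literals
    .replace(/\/\/.*$/gm, "")                     // line comments
    .replace(/\b\d+(\.\d+)?\b/g, "NUM")           // numeric literals
    .replace(/\b[A-Za-z_]\w*\b/g, (word) =>
      KEYWORDS.has(word) || word === "STR" || word === "NUM" ? word : "ID",
    )
    .replace(/\s+/g, " ")                         // collapse whitespace
    .trim();
}
```

Two fragments with the same fingerprint share a shape, for example the same "look up, throw if missing" guard with different names, and are worth a manual look.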
### Method 3: Git-Based Detection
```bash
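# Change count per file (churn). This ranks hotspots; it does not pair files,
# so treat files with similar counts only as candidates for shotgun surgery.
git log --name-only --pretty=format: | grep -v '^$' | sort | uniq -c | sort -rn
# Commit messages that hint at repeated or copied fixes
git log --all --oneline | grep -iE "fix.*same|duplicate|copy"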
```
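Counting changes per file does not show which files change together. One way to get pairs is to parse a log where each commit starts with a marker line, for instance the output of `git log --name-only --pretty=format:@@%h`, and count how often two files share a commit. In the sketch below `coChangeCounts` and the `@@` marker are assumptions chosen for illustration.

```typescript
// Input: text where a line starting with the marker opens a commit and the
// following non-blank lines are the files touched by that commit.
function coChangeCounts(log: string, marker: string = "@@"): Map<string, number> {
  const pairs = new Map<string, number>();
  let current: string[] = [];

  const flush = (): void => {
    const files = Array.from(new Set(current)).sort();
    for (let i = 0; i < files.length; i++) {
      for (let j = i + 1; j < files.length; j++) {
        const key = `${files[i]} <-> ${files[j]}`;
        pairs.set(key, (pairs.get(key) ?? 0) + 1);
      }
    }
    current = [];
  };

  for (const raw of log.split("\n")) {
    const line = raw.trim();
    if (line === "") continue;
    if (line.startsWith(marker)) flush(); // a new commit begins
    else current.push(line);
  }
  flush(); // last commit
  return pairs;
}
```

Pairs with a high count relative to each file's own change count are the ones that point at shotgun surgery. Commits that touch many files produce a quadratic number of pairs, so very large commits may need to be skipped.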
## Extraction Decision Tree
```
Found duplicate code?
├── Is it 3 lines or fewer?
│   └── YES → Leave it. Abstraction costs more than duplication.
├── Is it pure data/config?
│   └── YES → Extract to constants/config file
├── Is it identical across 2+ places?
│   ├── Same module → Extract to private helper
│   ├── Same layer → Extract to shared utility in that layer
│   └── Cross-layer → Extract to shared lib/package
├── Is it similar but not identical?
│   ├── Differs by 1-2 params → Parameterize (function args)
│   ├── Differs by behavior → Strategy pattern
│   ├── Differs by type → Generics / template
│   └── Differs significantly → Maybe NOT duplication (coincidence)
└── Is it boilerplate forced by framework?
    └── YES → Code generation, templates, or decorators
```
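The tree translates fairly directly into a function, which can be handy in a review checklist or a lint script. This is a sketch under assumptions: `Duplicate`, `Scope`, `Difference`, and `chooseExtraction` are hypothetical names, and the framework-boilerplate check is moved ahead of the identical/similar branches so that it is reachable.

```typescript
type Scope = "same-module" | "same-layer" | "cross-layer";
type Difference = "none" | "params" | "behavior" | "type" | "significant";

interface Duplicate {
  lines: number;
  isPureData: boolean;
  isFrameworkBoilerplate: boolean;
  difference: Difference; // "none" means the copies are identical
  scope: Scope;           // only used for identical copies
}

const IDENTICAL: Record<Scope, string> = {
  "same-module": "extract to private helper",
  "same-layer": "extract to shared utility in that layer",
  "cross-layer": "extract to shared lib/package",
};

const SIMILAR: Record<Exclude<Difference, "none">, string> = {
  params: "parameterize (function args)",
  behavior: "strategy pattern",
  type: "generics / template",
  significant: "probably not duplication (coincidence)",
};

function chooseExtraction(d: Duplicate): string {
  if (d.lines <= 3) return "leave it";
  if (d.isPureData) return "extract to constants/config file";
  // Assumption: checked before the identical/similar branches.
  if (d.isFrameworkBoilerplate) return "code generation, templates, or decorators";
  if (d.difference === "none") return IDENTICAL[d.scope];
  return SIMILAR[d.difference];
}
```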
## What IS and ISN'T Duplication
### IS Duplication (extract):
- Same validation logic in multiple API endpoints
- Same error formatting in multiple catch blocks
- Same DB query pattern (find by X, include Y, map to Z)
- Same auth check repeated across routes
- Same DTO transformation in multiple services
### IS NOT Duplication (leave alone):
- Similar but domain-different logic (calculating tax vs calculating discount)
- Test setup code (each test should be self-contained)
- Interface implementations that look similar but serve different contracts
- 2-3 lines of glue code (creating instance, calling method, returning result)
- Logging statements (same format, different context)
## Refactoring Patterns for Duplication
### Pattern 1: Extract Function
```
BEFORE:
// in file A
const user = await db.user.findUnique({ where: { id }, include: { profile: true } });
if (!user) throw new NotFoundException('User not found');
// in file B (identical)
const user = await db.user.findUnique({ where: { id }, include: { profile: true } });
if (!user) throw new NotFoundException('User not found');
AFTER:
// shared/users.ts (exported so that file A and file B can import it)
export async function findUserOrThrow(id: string) {
  const user = await db.user.findUnique({ where: { id }, include: { profile: true } });
  if (!user) throw new NotFoundException('User not found');
  return user;
}
```
### Pattern 2: Parameterize
```
BEFORE:
function getActiveUsers() { return db.user.findMany({ where: { status: 'active' } }); }
function getPendingUsers() { return db.user.findMany({ where: { status: 'pending' } }); }
AFTER:
function getUsersByStatus(status: UserStatus) {
  return db.user.findMany({ where: { status } });
}
```
### Pattern 3: Strategy/Callback
```
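BEFORE:
function processCSV(data) { const rows = parseCSV(data); validate(rows); saveToS3(rows); }
function processJSON(data) { const rows = parseJSON(data); validate(rows); saveToDB(rows); }
AFTER:
// The shared skeleton stays fixed; the varying steps are passed in.
function processData(data, parser, saver) {
  const parsed = parser(data);
  validate(parsed);
  saver(parsed);
}
// processData(csv, parseCSV, saveToS3); processData(json, parseJSON, saveToDB);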
```
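A typed, self-contained version of the same idea may make the contract between the steps clearer. Everything below is illustrative: `Row`, `parseCsv`, `parseJson`, and the two in-memory savers are hypothetical stand-ins for the real parsers and for S3 or a database.

```typescript
type Row = Record<string, string>;

// The one step that is identical for every format.
function validate(rows: Row[]): void {
  if (rows.length === 0) throw new Error("no rows to save");
}

// Shared pipeline: parse, validate, save. Only parser and saver vary.
function processData(
  raw: string,
  parser: (raw: string) => Row[],
  saver: (rows: Row[]) => void,
): Row[] {
  const parsed = parser(raw);
  validate(parsed);
  saver(parsed);
  return parsed;
}

// Naive CSV parser: first line is the header, no quoting support.
function parseCsv(raw: string): Row[] {
  const lines = raw.trim().split("\n");
  const keys = (lines[0] ?? "").split(",");
  return lines.slice(1).map((line) => {
    const cells = line.split(",");
    const row: Row = {};
    keys.forEach((key, i) => { row[key] = cells[i] ?? ""; });
    return row;
  });
}

const parseJson = (raw: string): Row[] => JSON.parse(raw) as Row[];

// In-memory stand-ins for the real storage backends.
const bucket: Row[] = [];
const table: Row[] = [];
const saveToBucket = (rows: Row[]): void => { bucket.push(...rows); };
const saveToTable = (rows: Row[]): void => { table.push(...rows); };
```

Usage would look like `processData(csvText, parseCsv, saveToBucket)` and `processData(jsonText, parseJson, saveToTable)`; adding a third format means writing one parser, not a third copy of the pipeline.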
### Pattern 4: Template Method
```
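BEFORE:
class EmailNotifier { format() {...} send() { this.format(); this.deliver(); this.log(); } }
class SMSNotifier { format() {...} send() { this.format(); this.deliver(); this.log(); } }
AFTER:
abstract class Notifier {
  // Steps that differ per channel are left to subclasses.
  protected abstract format(): string;
  protected abstract deliver(message: string): void;
  // log() is the same everywhere, so it lives in the base class.
  protected log(message: string): void { console.log(`sent: ${message}`); }
  // Template method: the sequence is fixed here, once.
  send(): void {
    const message = this.format();
    this.deliver(message);
    this.log(message);
  }
}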
```
## Cross-Module Duplication Rules
### Where to Put Shared Code
```
project/
└── src/
    ├── shared/            # Cross-module utilities
    │   ├── utils/         # Pure functions (no deps)
    │   ├── types/         # Shared type definitions
    │   └── constants/     # Shared constants
    ├── modules/
    │   ├── users/         # Module-specific code
    │   └── orders/        # Module-specific code
    └── lib/               # Framework-specific shared code
        ├── db.ts          # Database client
        ├── redis.ts       # Redis client
        └── logger.ts      # Logger config
```
### Monorepo Duplication
```
packages/
├── shared/        # Shared across all packages
│   ├── types/
│   ├── utils/
│   └── constants/
├── web/           # Uses shared/
├── api/           # Uses shared/
└── worker/        # Uses shared/
```
**Rule**: If two or more packages duplicate the same logic, move it to `shared/`.
**Exception**: Keep the logic package-specific if a single shared version would need a conditional for each consumer.
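The exception is easiest to recognize by its smell: a shared helper that has to know who is calling it. The sketch below contrasts that shape with package-local one-liners; `Consumer`, `formatTimestampShared`, and the three `formatFor*` functions are hypothetical and only illustrate the trade-off.

```typescript
// Smell: a "shared" helper that branches on its caller.
type Consumer = "web" | "api" | "worker";

function formatTimestampShared(date: Date, consumer: Consumer): string {
  if (consumer === "web") return date.toISOString().slice(0, 10); // date only
  if (consumer === "api") return date.toISOString();              // full ISO 8601
  return String(date.getTime());                                  // epoch milliseconds
}

// Alternative: each package keeps its own one-line formatter.
const formatForWeb = (date: Date): string => date.toISOString().slice(0, 10);
const formatForApi = (date: Date): string => date.toISOString();
const formatForWorker = (date: Date): string => String(date.getTime());
```

Each local formatter is a single line, which the threshold matrix already says to leave alone, while the shared version gains a new branch every time a package is added.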