KeiSeiKit-1.0/skills/architecture-rules/references/duplication.md
Parfii-bot 0be354a920 KeiSeiKit-public — clean state
Single-commit clean baseline after security scrub of niche-tells,
project codenames, internal jargon, and contributor-email leaks.

Contents:
- 100 Rust crates (_primitives/_rust/)
- 37 agent manifests (_manifests/) + generated specs (_generated/)
- 67 user-invocable skills (skills/)
- 33 hooks (hooks/)
- Composition blocks (_blocks/)
- Documentation (docs/, README.md)
- TS adapter packages (_ts_packages/)
- Assembler (_assembler/)
- Roles (_roles/)
- Templates (_templates/)
- Forgejo CI (.forgejo/)

Author: Denis Parfionovich <info@greendragon.info>

License: see LICENSE.
2026-05-01 12:09:03 +08:00

5.7 KiB

Duplication Detection Rules

Threshold Matrix

Duplication Type Threshold Action
Exact copy (>10 lines) 2 instances Extract immediately
Similar structure (>70% match, >15 lines) 3 instances Extract to shared function
Similar structure (>70% match, 5-15 lines) 4+ instances Consider extraction
Config/boilerplate patterns 5+ instances Generate/template
3 lines or fewer Any count Leave as is (3 lines < abstraction cost)

Detection Methods

Method 1: Structural Grep

# Find duplicate function bodies (exact)
grep -rn "function_pattern" src/ | sort | uniq -d

# Find similar patterns with context
grep -rn "pattern" src/ --include="*.ts" -A 5

Method 2: AST-Level Analysis

Look for these structural duplicates:

  • Same if/else chain in multiple files
  • Same error handling block repeated
  • Same validation logic in different endpoints
  • Same DB query with minor variations
  • Same transformation pipeline

Method 3: Git-Based Detection

# Files that always change together (shotgun surgery)
git log --name-only --pretty=format: | sort | uniq -c | sort -rn

# Similar commit patterns
git log --all --oneline | grep -i "fix.*same\|duplicate\|copy"

Extraction Decision Tree

Found duplicate code?
│
├── Is it 3 lines or fewer?
│   └── YES → Leave it. Abstraction costs more than duplication.
│
├── Is it pure data/config?
│   └── YES → Extract to constants/config file
│
├── Is it identical across 2+ places?
│   ├── Same module → Extract to private helper
│   ├── Same layer → Extract to shared utility in that layer
│   └── Cross-layer → Extract to shared lib/package
│
├── Is it similar but not identical?
│   ├── Differs by 1-2 params → Parameterize (function args)
│   ├── Differs by behavior → Strategy pattern
│   ├── Differs by type → Generics / template
│   └── Differs significantly → Maybe NOT duplication (coincidence)
│
└── Is it boilerplate forced by framework?
    └── YES → Code generation, templates, or decorators

What IS and ISN'T Duplication

IS Duplication (extract):

  • Same validation logic in multiple API endpoints
  • Same error formatting in multiple catch blocks
  • Same DB query pattern (find by X, include Y, map to Z)
  • Same auth check repeated across routes
  • Same DTO transformation in multiple services

IS NOT Duplication (leave alone):

  • Similar but domain-different logic (calculating tax vs calculating discount)
  • Test setup code (each test should be self-contained)
  • Interface implementations that look similar but serve different contracts
  • 2-3 lines of glue code (creating instance, calling method, returning result)
  • Logging statements (same format, different context)

Refactoring Patterns for Duplication

Pattern 1: Extract Function

BEFORE:
  // in file A
  const user = await db.user.findUnique({ where: { id }, include: { profile: true } });
  if (!user) throw new NotFoundException('User not found');

  // in file B (identical)
  const user = await db.user.findUnique({ where: { id }, include: { profile: true } });
  if (!user) throw new NotFoundException('User not found');

AFTER:
  // shared/users.ts
  async function findUserOrThrow(id: string) {
    const user = await db.user.findUnique({ where: { id }, include: { profile: true } });
    if (!user) throw new NotFoundException('User not found');
    return user;
  }

Pattern 2: Parameterize

BEFORE:
  function getActiveUsers() { return db.user.findMany({ where: { status: 'active' } }); }
  function getPendingUsers() { return db.user.findMany({ where: { status: 'pending' } }); }

AFTER:
  function getUsersByStatus(status: UserStatus) {
    return db.user.findMany({ where: { status } });
  }

Pattern 3: Strategy/Callback

BEFORE:
  function processCSV(data) { parse(data); validate(data); saveToS3(data); }
  function processJSON(data) { parse(data); validate(data); saveToDB(data); }

AFTER:
  function processData(data, parser, saver) {
    const parsed = parser(data);
    validate(parsed);
    saver(parsed);
  }

Pattern 4: Template Method

BEFORE:
  class EmailNotifier { format() {...}  send() { format(); deliver(); log(); } }
  class SMSNotifier   { format() {...}  send() { format(); deliver(); log(); } }

AFTER:
  abstract class Notifier {
    abstract format(): string;
    abstract deliver(): void;
    send() { this.format(); this.deliver(); this.log(); }
  }

Cross-Module Duplication Rules

Where to Put Shared Code

project/
├── src/
│   ├── shared/           # Cross-module utilities
│   │   ├── utils/        # Pure functions (no deps)
│   │   ├── types/        # Shared type definitions
│   │   └── constants/    # Shared constants
│   ├── modules/
│   │   ├── users/        # Module-specific code
│   │   └── orders/       # Module-specific code
│   └── lib/              # Framework-specific shared code
│       ├── db.ts         # Database client
│       ├── redis.ts      # Redis client
│       └── logger.ts     # Logger config

Monorepo Duplication

packages/
├── shared/               # Shared across all packages
│   ├── types/
│   ├── utils/
│   └── constants/
├── web/                  # Uses shared/
├── api/                  # Uses shared/
└── worker/               # Uses shared/

Rule: If 2+ packages duplicate the same logic, move to shared/. Exception: Keep package-specific if the shared version would need too many conditionals.