KeiSeiKit-1.0/_blocks/db-migration-hygiene.md
Author: Denis Parfionovich <info@greendragon.info>

License: see LICENSE.
2026-05-01 12:09:03 +08:00

DB — Migration hygiene (universal)

Applies to every migration tool — kei-migrate, Atlas, goose, sqlx-cli, drizzle-kit, Alembic, Prisma migrate, Ecto migrations. [E4 — expert assessment]

Numbering: timestamp prefix, not integer. 20260421_120000_add_users_email_index.sql sorts correctly forever and doesn't collide on parallel branches. Integer sequences (0001_, 0002_) collide on merge; reject them in review.
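
A minimal sketch of how the naming rule could be enforced in a review check. The exact pattern (YYYYMMDD_HHMMSS_ plus a snake_case description) is inferred from the example filename above; the function names are hypothetical and not part of any migration tool.

```python
import re
from datetime import datetime, timezone
from typing import Optional

# Assumed convention: YYYYMMDD_HHMMSS_<snake_case_description>.sql
TIMESTAMPED = re.compile(r"^(\d{8})_(\d{6})_[a-z0-9_]+\.sql$")


def migration_filename(description: str, now: Optional[datetime] = None) -> str:
    """Build a timestamp-prefixed migration filename (UTC by default)."""
    now = now or datetime.now(timezone.utc)
    slug = re.sub(r"[^a-z0-9]+", "_", description.lower()).strip("_")
    return f"{now:%Y%m%d_%H%M%S}_{slug}.sql"


def is_valid_migration_name(name: str) -> bool:
    """Accept timestamp prefixes; reject integer sequences such as 0001_."""
    match = TIMESTAMPED.match(name)
    if match is None:
        return False
    try:
        # The prefix must also be a real calendar date and time.
        datetime.strptime(match.group(1) + match.group(2), "%Y%m%d%H%M%S")
    except ValueError:
        return False
    return True
```

Because the prefix is fixed-width and most-significant-first, a plain lexicographic sort of the filenames is also a chronological sort.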

Up + down pairs: every migration has a reverse. If the reverse is destructive and unsafe (e.g. dropping a column with data), write a -- IRREVERSIBLE comment and stop the down-script there. NEVER auto-run destructive downs on prod without a human click.

Idempotent where possible:

CREATE TABLE IF NOT EXISTS users (...);
CREATE INDEX IF NOT EXISTS idx_users_email ON users(email);
ALTER TABLE users ADD COLUMN IF NOT EXISTS bio TEXT;    -- PG 9.6+, verify per-DB

Re-running a partially-applied migration should be safe. A migration that crashes mid-way and can't be re-run is a 2 AM incident waiting to happen.
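
The re-run property can be checked mechanically. The sketch below uses Python's stdlib sqlite3 as a stand-in database. SQLite has no ADD COLUMN IF NOT EXISTS, so the column guard is done by inspecting the schema, which is also the fallback on any engine lacking that clause. The helper names are hypothetical.

```python
import sqlite3


def column_exists(conn: sqlite3.Connection, table: str, column: str) -> bool:
    """SQLite lacks ADD COLUMN IF NOT EXISTS, so inspect the schema instead."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)  # row[1] is the column name


def migrate(conn: sqlite3.Connection) -> None:
    """Every step is guarded, so a re-run after a crash is a no-op."""
    conn.execute("CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, email TEXT)")
    conn.execute("CREATE INDEX IF NOT EXISTS idx_users_email ON users(email)")
    if not column_exists(conn, "users", "bio"):
        conn.execute("ALTER TABLE users ADD COLUMN bio TEXT")
    conn.commit()


conn = sqlite3.connect(":memory:")
migrate(conn)
migrate(conn)  # second run must not raise
```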

Zero-downtime pattern (add-then-drop):

  1. Deploy migration that ADDS new column / table (old code still works).
  2. Deploy app code that writes BOTH old + new.
  3. Backfill old → new.
  4. Deploy app code that reads new, ignores old.
  5. Deploy migration that DROPS old column.

Never DROP + ADD, or RENAME, a column in one migration on a live table. Running app code still references the old name, so that's a table lock + app-downtime event.
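
The add-then-drop sequence can be walked through end to end on a toy schema. This is a sketch under stated assumptions: SQLite stands in for the production database, the column names fullname and display_name are invented for illustration, and the five "deploys" are collapsed into one script purely to show the ordering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, fullname TEXT)")
conn.execute("INSERT INTO users (fullname) VALUES ('Ada Lovelace')")

# Step 1: additive migration; old code keeps working because nothing was removed.
conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")


# Step 2: app code writes BOTH the old and the new column.
def create_user(conn, name):
    conn.execute(
        "INSERT INTO users (fullname, display_name) VALUES (?, ?)", (name, name)
    )


create_user(conn, "Grace Hopper")

# Step 3: backfill rows that were written before step 2 shipped.
conn.execute("UPDATE users SET display_name = fullname WHERE display_name IS NULL")


# Step 4: app code reads only the new column.
def list_names(conn):
    return [r[0] for r in conn.execute("SELECT display_name FROM users ORDER BY id")]


# Step 5: a separate, later migration drops the old column.
# ALTER TABLE ... DROP COLUMN needs SQLite 3.35+, so it is guarded here.
if sqlite3.sqlite_version_info >= (3, 35, 0):
    conn.execute("ALTER TABLE users DROP COLUMN fullname")
conn.commit()
```

At every point between steps, both the previous and the next app version can run against the schema, which is what makes each deploy independently reversible.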

Backfill patterns:

  • Small table (< 1M rows): UPDATE ... SET new = f(old) in a single migration.
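  • Large table: background job with batched UPDATE ... WHERE id BETWEEN ? AND ? over bounded id ranges. Don't rely on UPDATE ... LIMIT: MySQL accepts it, PostgreSQL does not. Commit per batch. Monitor lag (e.g. replication lag).
  • Very large (> 100M rows): use the DB's native tooling. On PG, VACUUM FULL should not be needed; consider pg_repack if the table bloats after the column add and backfill. [UNVERIFIED: verify on current PG docs]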
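
A minimal sketch of the batched variant, again with SQLite as a stand-in. The email_lower column and lower(email) are hypothetical placeholders for new and f(old). The IS NULL filter is an addition here so that an interrupted job can be restarted without redoing finished rows.

```python
import sqlite3


def backfill_in_batches(conn, batch_size=1000):
    """Copy old -> new in bounded id ranges, committing after each batch."""
    low, high = conn.execute("SELECT MIN(id), MAX(id) FROM users").fetchone()
    if low is None:
        return 0  # empty table, nothing to do
    batches = 0
    start = low
    while start <= high:
        end = start + batch_size - 1
        conn.execute(
            "UPDATE users SET email_lower = lower(email) "
            "WHERE id BETWEEN ? AND ? AND email_lower IS NULL",
            (start, end),
        )
        conn.commit()  # short transactions keep locks brief
        batches += 1
        start = end + 1
    return batches


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, email_lower TEXT)"
)
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [(f"User{i}@Example.com",) for i in range(2500)],
)
conn.commit()
batches = backfill_in_batches(conn, batch_size=1000)
```

A production job would typically also sleep between batches and back off when lag rises; that is left out to keep the sketch short.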

Tracking table (_kei_migrations or equivalent): stores (version, name, checksum, applied_at). Checksum prevents silent tampering with an already-applied file. If checksum mismatches on an applied migration → hard-fail, demand human intervention.
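
The checksum rule can be sketched as follows. This is not kei-migrate's actual schema or code: the column types, the choice of SHA-256, and the function names are assumptions made for illustration, with SQLite standing in for the target database.

```python
import hashlib
import sqlite3


class ChecksumMismatch(RuntimeError):
    """An already-applied migration file no longer matches its recorded checksum."""


def checksum(sql_text: str) -> str:
    return hashlib.sha256(sql_text.encode("utf-8")).hexdigest()


def ensure_tracking_table(conn):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS _kei_migrations ("
        "version TEXT PRIMARY KEY, name TEXT NOT NULL, "
        "checksum TEXT NOT NULL, "
        "applied_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
    )


def apply_migration(conn, version, name, sql_text):
    """Apply once; hard-fail if an applied file was edited.

    Returns True if the migration ran now, False if it was already applied.
    """
    ensure_tracking_table(conn)
    row = conn.execute(
        "SELECT checksum FROM _kei_migrations WHERE version = ?", (version,)
    ).fetchone()
    digest = checksum(sql_text)
    if row is not None:
        if row[0] != digest:
            raise ChecksumMismatch(f"{version}_{name}: file changed after apply")
        return False  # already applied and unchanged
    # Note: in this sketch the script and the bookkeeping insert are not atomic.
    conn.executescript(sql_text)
    conn.execute(
        "INSERT INTO _kei_migrations (version, name, checksum) VALUES (?, ?, ?)",
        (version, name, digest),
    )
    conn.commit()
    return True
```

Raising rather than logging is deliberate: a mismatch means the recorded history and the files on disk disagree, and only a human can decide which one is right.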

Forbidden:

  • Editing a migration file after it's been applied on any environment (checksum break).
  • DROP TABLE without backup + 24h cooldown.
  • Mixing DDL + large DML in one transaction (long locks).
  • Running migrations automatically on app startup in multi-replica deploys without a leader-election guard (every replica tries to apply = race condition).
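
One simple stand-in for a leader-election guard is a single-row lock table: whichever replica inserts the row first runs the migrations, and the rest skip. The sketch below assumes that approach; the table name _migration_lock and the helper names are hypothetical. On PostgreSQL, an advisory lock (pg_try_advisory_lock) is a common alternative that is released automatically when the session ends.

```python
import sqlite3


def try_acquire_migration_lock(conn, holder):
    """Return True if this replica won the lock, False if another holds it."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS _migration_lock ("
        "id INTEGER PRIMARY KEY CHECK (id = 1), holder TEXT NOT NULL)"
    )
    try:
        # The primary key allows exactly one row, so only one insert can succeed.
        conn.execute(
            "INSERT INTO _migration_lock (id, holder) VALUES (1, ?)", (holder,)
        )
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        conn.rollback()
        return False


def release_migration_lock(conn, holder):
    """Only the current holder can release the lock."""
    conn.execute(
        "DELETE FROM _migration_lock WHERE id = 1 AND holder = ?", (holder,)
    )
    conn.commit()
```

A row-based lock like this stays behind if the holder crashes, so a real deployment would need an expiry or a manual release procedure.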