Parfii-bot 209460df6b fix(audit-batch-2): regressions from prev batch + 2nd-wave audit findings

12-agent audit (waves 3+4 Opus+Sonnet) on commit 88de01c found that 2 of
my prior fixes had regressions, plus the prev batch missed 8 stale-text
sites and 2 latent bugs. This batch closes them all.

== Regressions in audit-batch (88de01c) — now fixed ==

1. PRAGMA user_version=9 placement — could silently downgrade schema on
   cross-version install (existing v10 DB → re-run reset to 9 →
   migrations replay → ALTER TABLE duplicate-column errors)
   - install/sql/outcome-only-schema.sql: PRAGMA moved OUTSIDE the
     transaction (after COMMIT) for portability across SQLite versions
   - install/lib-profile-outcome-only.sh::_outcome_install_ledger:
     added downgrade guard — reads existing user_version BEFORE running
     ANY init path; if >9, skips entirely (preserves newer schema)
   - VERIFIED: simulated v10 DB → re-run prints "skipping init to
     preserve newer schema"; user_version stays at 10 (was downgraded
     to 9 in the prior batch) [REAL: ran in this session]

2. backup_file mv→cp workaround left orphan backups + bypassed rollback
   contract (BACKUP_PAIRS not registered)
   - install/lib-profile-outcome-only.sh: now manually appends to
     BACKUP_PAIRS so rollback trap restores on later failure;
     removes the .bak on success path
   - Comment updated to explain the workaround vs backup_file mv

3. CLAUDE.md skip-guard "STATUS-TRUTH MARKER" was too broad —
   false-positive on existing kit users (RULE 0.16 doc text matches)
   - lib-profile-outcome-only.sh: changed grep to literal HTML comment
     marker `<!-- outcome-only profile (KeiSeiKit) -->` (specific marker
     written by the installer itself)

== Tier 1 missed in prev batch — now fixed ==

4. _ts_packages/package-lock.json referenced packages/cortex-ui which
   does NOT exist on disk → npm ci would fail with ELSPROBLEMS in CI
   - Regenerated via fresh `rm package-lock.json && npm install`
   - npm ci now exits 0 cleanly [REAL: ran in this session]
   - Lockfile shrunk 2403→0 lines on the cortex-ui section (full regen)

5. v3 triggers (branch length cap ≤256) were MISSING from
   outcome-only-schema.sql — sqlite3 fallback path skipped a schema
   feature that the Rust kei-ledger flow enforces, creating cross-flow
   drift
   - Added trg_agents_branch_len_ins + trg_agents_branch_len_upd
     mirroring migrations_list.rs:30-44
   - Header comment in outcome-only-schema.sql rewritten to match
     current behavior (was stale)
   - VERIFIED: end-to-end install creates 2 triggers [REAL: sqlite3
     .schema | grep trg_agents_branch_len returns 2]

6. README.md:232 said "102 crates" while README.md:9 said "105 crates"
   — internal contradiction in same doc
   - README:232 → "105 workspace crates"

7. ARCHITECTURE.md:165 "53 Rust crates + 13 shell primitives" stale
   - Updated to "105 Rust workspace crates (47 declared in MANIFEST.toml
     `full` profile) + 14 shell primitives"

8. ARCHITECTURE.md:157 "45 /commands" stale
   - Updated to 68

9. plugin.json + marketplace.json description strings still had
   pre-fix counts (23 primitives / 39 skills / 9 hooks / 12 agents)
   - Both rewritten to match README:9 SSoT (38 agents / 68 skills /
     38 hooks / 105 workspace crates / 47 installable + 14 shell)

10. PROFILE-OUTCOME-ONLY.md:28-29 "What does NOT get installed" still
    cited 102/67/37/82
    - Updated to 105/68/38/85

11. encyclopedia/substrate-overview.md §6/§11/§12 still said
    "80-char DNA"; §13 said "495 DNA indices"; §6 said "11 install
    profiles (.../Cursor/Continue/etc)"
    - All 4 sites fixed to current language (≥33-char variable, 565
      DNAs, 12 install profiles)

12. docs/DNA-INDEX.md:1352 said wire format is "(80 chars)"
    - Updated to "(≥33 chars; role + caps slugs are variable — see
      docs/DNA-FORMAT.md)"

== Tier 2 honesty fixes ==

13. Wagner et al. 2004 citation in SLEEP-LAYER.md:26 lacked [VERIFIED]
    marker (W3 doc consistency caught it)
    - Added [VERIFIED: doi:10.1038/nature02223] + clarification that
      the original study did not isolate a specific sleep stage; SWS
      attribution comes from secondary literature (Diekelmann/Born)

14. PHILOSOPHY.md:125 attributed "overnight consolidation of un-finished
    intentions" to Wagner 2004 — that paper is about insight gain on
    the Number Reduction Task, not Zeigarnik-effect cued memory
    - Rewritten to accurately describe Wagner 2004's actual finding +
      [VERIFIED: doi:10.1038/nature02223]

Verification:
- `npm ci` in _ts_packages/ exits 0 [REAL: ran in this session]
- `cargo check --workspace` exits 0 in _primitives/_rust [REAL: ran in
  this session]
- Outcome-only end-to-end fresh install produces user_version=9 +
  2 triggers (correct schema shape)
- Outcome-only re-run against v10 DB preserves user_version=10
  (downgrade guard works)
- CLAUDE.md skip-guard now triggers ONLY on literal marker, not on
  RULE 0.16 phrase

NOT addressed in this batch (deferred to a future round):
- github KeiSei84/{KeiSeiKit, KeiSeiKit-1.0} 404 (user-side action:
  publish repo or update refs)
- keigit user `keisei` does not exist (user-side: create org or
  rename scope)
- KEIGIT_TOKEN secret not configured (user-side action)
- Forgejo registration disabled (admin-side)
- safeEqual timing leak in TS server (LOW per W3 reassessment)
- HTTP bind 0.0.0.0 default (MEDIUM)
- Unbounded request body (MEDIUM)
- Outcome-only confirm-screen bypass (RULE 0.1 spirit)
- Ledger fallthrough false summary
- Node 20 deprecation (deadline 2026-06-02, 30 days)
- Hook count triple-discrepancy (38 README / 53 DNA-INDEX / 35 maturity-row)
- 100-row router claim still in README:117 + PROFILE-OUTCOME-ONLY.md
- INSTALL.md numerics without [REAL:] markers
- Stale .bak files accumulation policy (cosmetic)
- README per-claim [REAL: ] markers for 6 of 7 numerics

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-03 20:30:51 +08:00

11 KiB


Philosophy — KeiSeiKit as a Living Neural Structure

The README claims KeiSeiKit is "a living neural structure." This document is the long form of that claim: which biological properties we chose, why, how each one maps to code shipped in this repo, and what the tradeoffs are.

The one-line thesis

A software toolkit accumulates features. A living neural structure accumulates identity, memory, lineage, and the ability to recover from its own mistakes. KeiSeiKit is built around the second list.

The five properties

A neural system is distinguishable from a collection of functions by five properties. Each one is load-bearing — remove it and the structure stops being alive in the relevant sense.

Identity — each unit has a stable, reproducible name, not a random UUID or a human-friendly slug that drifts.
Lineage — each unit knows who produced it and from what.
Memory — the system remembers what it did yesterday, across restarts, across machines, across sessions.
Consolidation — memory is not raw logs; it is periodically replayed and compressed into patterns.
Corrective learning — mistakes the system notices in itself surface as explicit artefacts (rules, hooks, tests) the next session inherits.

The rest of this document maps each property to the shipped code.

1. Identity — DNA

Every agent invocation resolves to a deterministic variable-length string (≥33 chars; role + caps slugs are variable):

<role>::<caps-bitmap>::<scope-hash>::<body-hash>-<nonce>

role — role slug (e.g. edit-local, research-web).
caps-bitmap — the resolved capability list, encoded as ordered 2-char atom codes (NG = no-git-ops, FW = files-whitelist, TG = tests-green).
scope-hash — 8 hex chars (32-bit) of SHA-256 over canonicalised scope.
body-hash — 8 hex chars of SHA-256 over the task body text.
nonce — 8 hex chars of rand::random::<u32>(), full 32-bit entropy.

The shape is enforced by kei-agent-runtime::dna::parse. Two invocations with the same role, capability set, scope, and task body produce the same <role>::<caps>::<scope>::<body> prefix, and differ only by nonce. That makes DNA both deterministic (for reasoning) and unique (for collision resistance: with a 32-bit nonce, the birthday threshold is ≈ 65k agents sharing the same prefix).
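To make the field layout concrete, here is a minimal sketch of a shape check for the string described above. It is not the code in kei-agent-runtime::dna::parse; the `Dna` struct, `parse_dna`, and `is_hex8` are hypothetical names, and the rule that the caps field is a run of whole 2-char ASCII codes (possibly empty) is an assumption drawn from the field descriptions.

```rust
/// Borrowed view of the five DNA fields (hypothetical type, illustration only).
#[derive(Debug, PartialEq)]
pub struct Dna<'a> {
    pub role: &'a str,
    pub caps: Vec<&'a str>,
    pub scope_hash: &'a str,
    pub body_hash: &'a str,
    pub nonce: &'a str,
}

/// True when `s` is exactly 8 ASCII hex digits.
fn is_hex8(s: &str) -> bool {
    s.len() == 8 && s.bytes().all(|b| b.is_ascii_hexdigit())
}

/// Splits `<role>::<caps-bitmap>::<scope-hash>::<body-hash>-<nonce>` into fields.
/// Assumption: caps is a run of 2-char ASCII atom codes and may be empty.
pub fn parse_dna<'a>(input: &'a str) -> Result<Dna<'a>, String> {
    let parts: Vec<&str> = input.split("::").collect();
    if parts.len() != 4 {
        return Err(format!("expected 4 `::`-separated fields, got {}", parts.len()));
    }
    let (role, caps_raw, scope_hash) = (parts[0], parts[1], parts[2]);
    // The last field carries both the body hash and the nonce, joined by `-`.
    let (body_hash, nonce) = match parts[3].split_once('-') {
        Some(pair) => pair,
        None => return Err("missing `-` between body-hash and nonce".to_string()),
    };
    if role.is_empty() {
        return Err("role slug is empty".to_string());
    }
    if !caps_raw.is_ascii() || caps_raw.len() % 2 != 0 {
        return Err("caps-bitmap must be a run of 2-char ASCII codes".to_string());
    }
    for (name, field) in [("scope-hash", scope_hash), ("body-hash", body_hash), ("nonce", nonce)] {
        if !is_hex8(field) {
            return Err(format!("{} must be 8 hex chars", name));
        }
    }
    // Cut the caps field into ordered 2-char atom codes (e.g. "NGFW" -> ["NG", "FW"]).
    let caps = (0..caps_raw.len()).step_by(2).map(|i| &caps_raw[i..i + 2]).collect();
    Ok(Dna { role, caps, scope_hash, body_hash, nonce })
}
```

Because the role slug may itself contain hyphens (edit-local), the sketch splits on `::` first and only then splits the last field on `-`.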

Why not UUIDs. A UUID hides what the agent was supposed to do. A DNA string is greppable: you can look at a ledger row and see the capability bitmap without joining five tables.

Why not slugs. Slugs collide and drift. DNA is stable under renames because the role slug is part of the hash input, not a sidecar.

Source: _primitives/_rust/kei-agent-runtime/src/dna.rs.

2. Lineage — creator_id and fork_parent_id

Every row in the agent ledger carries two additional columns:

creator_id — DNA or human id of whoever spawned this row.
fork_parent_id — DNA of the agent this row was forked from, if any.

This is SQL-level lineage. Any artefact produced during a session can be traced back to the agent that produced it, to the agent that spawned that agent, to the human session the chain started from.
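The trace-back described above can be sketched as a walk over creator_id. This is an in-memory illustration, not the kei-ledger schema or its SQL: `LedgerRow`, `trace_creators`, and the use of a `HashMap` keyed by DNA are assumptions, as is the convention that an id with no ledger row of its own is the human session at the root.

```rust
use std::collections::{HashMap, HashSet};

/// The two lineage columns of a ledger row (hypothetical in-memory mirror).
pub struct LedgerRow {
    pub creator_id: String,
    pub fork_parent_id: Option<String>,
}

/// Follows creator_id from `start` until an id has no ledger row
/// (assumed to be the human session) or a cycle is detected.
pub fn trace_creators(ledger: &HashMap<String, LedgerRow>, start: &str) -> Vec<String> {
    let mut chain = vec![start.to_string()];
    let mut seen: HashSet<String> = HashSet::new();
    seen.insert(start.to_string());
    let mut current = start.to_string();
    while let Some(row) = ledger.get(&current) {
        // Stop instead of looping forever if the data contains a cycle.
        if !seen.insert(row.creator_id.clone()) {
            break;
        }
        chain.push(row.creator_id.clone());
        current = row.creator_id.clone();
    }
    chain
}
```

The same walk over fork_parent_id would give the fork ancestry instead of the spawn ancestry.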

Why this matters. Software without lineage produces "where did this file come from" questions at merge time. Lineage makes them disappear: the merge-ceremony skill prints the DAG and the human picks which forks merge, which squash, which defer.

Schema: _primitives/_rust/kei-ledger/src/schema.rs migration v4.

3. Memory — three layers

The memory layer is deliberately three-tiered, mirroring the hippocampal-cortical split:

Raw episodes — session JSONL traces, append-only, one file per session. This is the hippocampus: fast, stateful, volatile (survives until the next full pull), not interpreted.
Project memory — memory/{project}.md one file per project, self-contained, constraints + stack + status + learnings with evidence grades.
Index — MEMORY.md, one line per project, ≤200 lines total. No inline data. Reading this file gives the shape of the world.

Any session's "what did we decide last time" is a read of the corresponding project file, not a scroll through chat history. The index guarantees that read is fast.

Why this layering. A single giant memory file stops being read because it cannot be scanned. A million tiny files stop being read because there is no entry point. Three layers — index → project file → raw traces — give you O(1) navigation to the right detail.
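The index-to-project hop can be sketched in a few lines. The helper names (`project_memory_path`, `check_index`) are hypothetical, and the assumption is only what the text states: one memory/{project}.md file per project and an index capped at 200 lines.

```rust
use std::path::PathBuf;

/// Upper bound on MEMORY.md length, as stated in the text.
pub const MAX_INDEX_LINES: usize = 200;

/// Maps a project name to its project-memory file, memory/{project}.md.
pub fn project_memory_path(project: &str) -> PathBuf {
    PathBuf::from("memory").join(format!("{}.md", project))
}

/// Returns the index line count, or an error when the cap is exceeded.
pub fn check_index(index_text: &str) -> Result<usize, String> {
    let n = index_text.lines().count();
    if n > MAX_INDEX_LINES {
        Err(format!("index has {} lines, cap is {}", n, MAX_INDEX_LINES))
    } else {
        Ok(n)
    }
}
```

Keeping the cap as a checked invariant rather than a convention is what keeps the first read cheap.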

Source: ~/.claude/rules/memory-protocol.md (the full memory-protocol rule, reusable across projects).

4. Consolidation — REM and NREM

Raw traces become patterns only if something replays them. KeiSeiKit's sleep layer runs in three phases on a nightly schedule:

Phase A — Incubation ("sleep on it")

During the day, you drop tasks into /sleep-on-it. Each task gets a priority (quick 15 min / standard 60 min / deep 240 min / marathon 480 min) and, optionally, a checkpoint cadence. At 03:00 local time, a remote Claude Code agent on Anthropic's cloud picks up the queue (up to 480 minutes total across ≤ 5 tasks, packed greedily in FIFO order) and works until the budget is exhausted or a checkpoint fires.
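The nightly packing can be sketched as follows. The names (`Priority`, `pack_queue`) are hypothetical, and one detail is an assumption the text does not settle: a task that does not fit the remaining budget is skipped and stays queued, while later, smaller tasks may still be picked.

```rust
/// Task sizes from the text: quick 15, standard 60, deep 240, marathon 480 minutes.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Priority {
    Quick,
    Standard,
    Deep,
    Marathon,
}

impl Priority {
    pub fn minutes(self) -> u32 {
        match self {
            Priority::Quick => 15,
            Priority::Standard => 60,
            Priority::Deep => 240,
            Priority::Marathon => 480,
        }
    }
}

pub const NIGHTLY_BUDGET_MIN: u32 = 480;
pub const MAX_TASKS: usize = 5;

/// Walks the queue in FIFO order and returns the indices of the tasks picked
/// for tonight. Assumption: tasks that do not fit are skipped, not blocking.
pub fn pack_queue(queue: &[Priority]) -> Vec<usize> {
    let mut remaining = NIGHTLY_BUDGET_MIN;
    let mut picked = Vec::new();
    for (i, task) in queue.iter().enumerate() {
        if picked.len() == MAX_TASKS {
            break;
        }
        let cost = task.minutes();
        if cost <= remaining {
            remaining -= cost;
            picked.push(i);
        }
    }
    picked
}
```

For a queue of deep, marathon, standard, quick, deep, this sketch picks indices 0, 2, and 3 and leaves the marathon and the second deep task for a later night.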

Biological analog: the post-sleep insight gain documented by Wagner et al. 2004, Nature 427:352–355 [VERIFIED: doi:10.1038/nature02223]. The original paper showed that problems unsolved at bedtime are sometimes solved on waking, not because the brain ran harder but because it ran offline. The study did not isolate a specific sleep stage, and our metaphor is a loose mapping of that observation onto the kit's offline consolidation.

Phase B — REM consolidation

After Phase A, the same agent reads the last 24 h of JSONL traces, diffs them against the previous report, and writes reports/sleep-YYYY-MM-DD.md. Cross-session patterns (≥ 3 occurrences across ≥ 2 distinct sessions) are prepended to backlog.md.
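The cross-session threshold (≥ 3 occurrences across ≥ 2 distinct sessions) can be sketched as a small aggregation. The function name and the input shape, a list of (session id, pattern label) pairs, are assumptions; how patterns are actually extracted from the JSONL traces is not described here.

```rust
use std::collections::{BTreeMap, BTreeSet};

pub const MIN_OCCURRENCES: usize = 3;
pub const MIN_SESSIONS: usize = 2;

/// Takes (session_id, pattern) observations and returns the patterns that
/// occur at least 3 times across at least 2 distinct sessions, in sorted order.
pub fn cross_session_patterns<'a>(observations: &[(&'a str, &'a str)]) -> Vec<&'a str> {
    // pattern -> (total occurrences, set of sessions it appeared in)
    let mut stats: BTreeMap<&'a str, (usize, BTreeSet<&'a str>)> = BTreeMap::new();
    for &(session, pattern) in observations {
        let entry = stats.entry(pattern).or_insert_with(|| (0, BTreeSet::new()));
        entry.0 += 1;
        entry.1.insert(session);
    }
    stats
        .into_iter()
        .filter(|(_, (count, sessions))| {
            *count >= MIN_OCCURRENCES && sessions.len() >= MIN_SESSIONS
        })
        .map(|(pattern, _)| pattern)
        .collect()
}
```

Requiring both conditions filters out a pattern that repeats many times inside one noisy session as well as one that appears once each in two sessions.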

Biological analog: REM dream-state. Pattern extraction, not raw replay.

Phase C — NREM deep sleep

Every seven days (by default; configurable to zero to disable) the pipeline also runs kei-conflict-scan → kei-refactor-engine → optional kei-graph-check. The output is a plan-only markdown file or a plan + fork branch (deep-sleep/YYYY-MM-DD) with git apply-ready changes. Ambiguous conflicts are excluded from any auto-patch and listed explicitly for human decision.

Biological analog: NREM slow-wave sleep. System-level consolidation. Integrating, not just reviewing.

The no-feedback-loop invariant

Nothing the cloud agent writes is ever auto-injected into a Claude Code session. The morning report is for human review. Any rule or hook that emerges from it is installed by hand via /escalate-recurrence. This is a deliberate architectural choice: auto-learning loops without human signoff are how models drift.

Source: docs/SLEEP-LAYER.md and ~/.claude/rules/sleep-layer.md.

5. Corrective learning — self-audit

Three passive hooks run during any session:

session-end-dump — on Stop, archives the session trace and ingests it into kei-memory.
milestone-commit-hook — on feat: / refactor: / merge commits, appends a one-line summary to audit-backlog.md.
error-spike-detector — when three or more errors occur in the last twenty tool calls, tags the pattern.
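The error-spike rule (three or more errors in the last twenty tool calls) is a sliding-window count. A minimal sketch, assuming each tool call is reduced to a success/error flag; `ErrorSpikeDetector` and its methods are hypothetical names, not the shipped hook.

```rust
use std::collections::VecDeque;

/// Sliding window over the most recent tool-call outcomes.
pub struct ErrorSpikeDetector {
    window: VecDeque<bool>,
}

impl ErrorSpikeDetector {
    /// Window size and threshold as stated in the text.
    pub const WINDOW: usize = 20;
    pub const THRESHOLD: usize = 3;

    pub fn new() -> Self {
        ErrorSpikeDetector {
            window: VecDeque::with_capacity(Self::WINDOW),
        }
    }

    /// Records one tool call and reports whether the window now holds a spike.
    pub fn record(&mut self, is_error: bool) -> bool {
        if self.window.len() == Self::WINDOW {
            // Drop the oldest call so only the last 20 are counted.
            self.window.pop_front();
        }
        self.window.push_back(is_error);
        self.window.iter().filter(|&&e| e).count() >= Self::THRESHOLD
    }
}
```

Counting over a bounded window means old errors age out on their own, so a spike flag clears once enough clean calls follow.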

These feed the /self-audit skill, which classifies recurring problems and surfaces them via click-only AskUserQuestion. The user can route a finding to:

/escalate-recurrence — codify as rule + wiki + optional hook.
/debug-deep — 5-phase root-cause analysis.
hook-only — mechanical block / enforce / warn / remind.
backlog — log, surface next session.
postpone — keep open, resurface later.

Silent-first mode. The first ten sessions log only — no prompts. This prevents false-positive fatigue while the memory store is still empty. Session 11 onward, the self-audit starts surfacing items.

Source: ~/.claude/rules/session-self-audit.md (RULE 0.14) and skills/self-audit/.

Growth — the sixth property, emergent

A substrate that satisfies properties 1–5 can support a sixth that is harder to design for directly: growth. New primitives, new blocks, new agents, new projects enter through user-driven commands and accumulate in a way the next session can find.

New primitive. /compose-solution decomposes a free-text problem, greps existing atoms for prior art, and drafts a block if nothing matches. The draft is persisted on user click, discoverable thereafter.
New agent. /spawn-agent emits a manifest + DNA + ledger row. The assembler hook rebuilds the markdown Claude Code picks up.
New project. /new-project is a 4-phase skill: intake, fork skeleton, parallel execution (orchestrator owns git per RULE 0.13), merge ceremony.

Growth is not a feature we implemented. It is what the other five properties produce when you compose them.

What this is not

A neural network. The name "neural structure" is an analogy about properties (identity, lineage, memory, consolidation, corrective learning), not a claim about weights or gradients. Nothing in KeiSeiKit trains on your data. The cloud agent in the sleep layer is a standard Claude Code session with scheduled triggers — it reads your traces to write a report, not to fine-tune itself.

A federation. As of v0.24, KeiSeiKit ships as a single-user substrate installed next to Claude Code. Cross-user signing, marketplace publishing of blocks, and federation are on the roadmap but not yet shipped. If a doc claims otherwise, that doc is stale.

A framework. A framework tells you how to structure your application. A substrate gives your agents identity, lineage, memory, and sleep — nothing about it dictates the application. You can delete every skill in this repo and the substrate still works; you can also add fifty more and it still works.

The constraints that shaped this

Three constraints, made explicit because they push back against common defaults:

Constructor Pattern. One file, one class, one concern. Files longer than 200 lines are decomposed. Functions longer than 30 lines are split. No mixins, no DI containers, no abstract factories. This keeps the graph readable by both humans and Claude.
Rust-first default. New primitive code is Rust unless there is a cited exception (ML training > 10M params / existing-language project / platform UI / browser-DOM / one-off < 50 lines / external binding only / explicit user override). The reason is not performance — it is that the Rust type system catches the class of mistakes LLMs most often introduce (None vs [], missing .await, unhandled Result) at compile time.
Local-first. Nothing is pushed anywhere by default. The sleep layer's memory-repo is user-owned, on whatever remote the user chose (or no remote — everything works locally).
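As an illustration of the "None vs []" point in the Rust-first constraint, the sketch below keeps "never loaded" and "loaded but empty" as distinct cases that the compiler forces the caller to handle. The function and its messages are hypothetical and not taken from the repo.

```rust
/// Distinguishes a missing list from an empty one. Removing any arm makes
/// the match non-exhaustive, which the compiler rejects.
pub fn describe_tags(tags: Option<&[String]>) -> String {
    match tags {
        None => "tags were never loaded".to_string(),
        Some([]) => "tags loaded, none present".to_string(),
        Some(list) => format!("{} tag(s)", list.len()),
    }
}
```

In a dynamically typed language the first two cases are easy to conflate; here the type `Option<&[String]>` makes the difference part of the signature.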

If these constraints feel restrictive, they are — deliberately. They are the shape of the substrate, not decorations.
