12-agent audit (2 waves Opus+Sonnet, 6 slices each) flagged 3 HIGH-tier
issues that BOTH waves agreed on, plus 5 doc-honesty findings. This
batch fixes the lot.
== CI green (was failing on main 94a7d68) ==
- _primitives/_rust/Cargo.toml — workspace tokio gains `io-std` feature
(needed by kei-mcp/src/main.rs which calls tokio::io::{stdin,stdout})
- _primitives/_rust/kei-mcp/Cargo.toml — dev-deps tokio gains `test-util`
feature (needed by tests/tools_call_timeout.rs for tokio::time::advance
and Builder::start_paused). Both verified locally:
`cargo check -p kei-mcp` ✓
`cargo test --no-run -p kei-mcp` ✓ (3 test binaries link)
[REAL: ran 2026-05-03 in this session]
== HIGH-tier audit fixes (consensus across waves) ==
1. SQLi escape in agent-outcome-backfill.sh:110
- 4 of 12 agents flagged: TOOL_USE_ID was JSON-derived and
interpolated raw into SQL. Allowlist on $SHIPPED protected today
but a future case-statement removal opened the surface.
- Fix: tiny `_sql_esc` helper that doubles single-quotes (SQL-99
standard escape), applied to SHIPPED + TOOL_USE_ID. STUBS already
integer-validated.
2. PRAGMA user_version=9 in install/sql/outcome-only-schema.sql
- W1 outcome-only critic flagged: the SQL fallback installed a
v9-equivalent flat schema but left user_version=0. A LATER
`kei-ledger init` (e.g. when user upgrades to full kit) would
re-run migrations v1-v9 and ALTER TABLE ADD COLUMN duplicate-error
mid-migration → broken DB.
- Fix: set PRAGMA user_version=9 before COMMIT so the binary's
migration runner sees current ≥ target and short-circuits.
3. backup_file mv→cp + uninstall macOS-portable awk
- W1+W2 outcome-only flagged: lib-backup.sh uses `mv` which DELETES
the target before _jq_merge_hooks runs; `|| true` swallowed the
subsequent jq read-error → silent settings.json loss.
- Fix in lib-profile-outcome-only.sh: `cp -p` aside, drop `|| true`,
return 1 on merge failure (trap restores).
- PROFILE-OUTCOME-ONLY.md uninstall used GNU sed `,+1` extension
which BSD sed (macOS) does not support — uninstall silently
no-op'd on macOS, leaving orphan CLAUDE.md text.
- Fix: replace with portable `awk` recipe; also added `rm -f` for
the agent-toolstats.jsonl sidecar (privacy completeness).
== Doc honesty pass (RULE 0.18 numerics + RULE 0.4 citations) ==
4. README.md count drift — verified all values against filesystem:
* 102→105 Rust crates (Cargo.toml workspace `members` count)
* 67→68 skills (`ls skills/ | wc -l`)
* 35→38 hooks (`grep -c '"command":' settings-snippet.json`)
* 37→38 agent manifests (`ls _manifests/*.toml | wc -l`)
* 82→85 substrate blocks (`find _blocks/ -name '*.md' | wc -l`)
* 18 capability atoms VERIFIED via `find _capabilities/ -name '*.md'`
(encyclopedia §3 row count of 17 is in a separate file and is a
known internal display issue, not changed in this commit)
* 495→565 active DNAs (per docs/DNA-INDEX.md header 2026-05-03)
Each value now carries a `[REAL: <command>]` style trailer per
RULE 0.18.
5. README.md DNA "80-char identity" → "≥33-char variable-length"
- W1+W2 reviewer-pass flagged FALSE: docs/DNA-FORMAT.md SSoT says
minimum 33 chars; 80 was nowhere in code or spec
- Fix in README.md:36 + docs/PHILOSOPHY.md:39 + docs/DNA-INDEX.md:1352
6. README.md "Eleven install profiles (... Cursor / Continue / Zed /
Aider / Docker / Nix)" — Cursor/Continue/Zed/Aider/Docker/Nix were
never install profiles, they were bridge targets
- Fix: list 12 actual profiles from _primitives/MANIFEST.toml,
mention bridges as separate concept
7. .claude-plugin/plugin.json license MIT → Apache-2.0
- W2-Sonnet reviewer flagged: LICENSE file is Apache-2.0 (since
2026-04-30 per NOTICE), but plugin.json still declared MIT —
plugin marketplace would show wrong license
8. docs/ARCHITECTURE.md:318 placeholder URL `https://example.invalid/...`
- W2-Sonnet reviewer flagged: dead link in published docs
- Fix: remove the bad href, describe ssl-rule-file as per-user
install outside the public repo
9. skills/sleep-on-it/SKILL.md Wagner et al. 2004 citation
- W1+W2 reviewer flagged RULE 0.4 violation: citation without
verification marker
- Fix: added [VERIFIED: doi:10.1038/nature02223] + clarification
that the original paper showed slow-wave-sleep (not strictly REM)
insight gain — our metaphor is a loose mapping
10. encyclopedia/substrate-overview.md §5 fabricated TS deps
- W1-Opus doc-consistency flagged RULE 0.4.b violation: 5 of 6
package rows had INVENTED dependency strings
(`recall-ai-sdk ^1.0.0`, `nodemailer-mock ^2.0.0`,
`telegram-typings ^4.10.0`, etc — none exist in the actual
package.json files)
- Fix: regenerated table from real `package.json` reads via
`node -p "require(...).dependencies"` for each of the 6 packages
- Fix: also corrected version drift (5 packages all 0.14.0 now)
Verification:
- Outcome-only end-to-end install against fake $HOME succeeds:
hooks installed, ledger schema at user_version=9, settings.json
created cleanly, all 5 documented files present
[REAL: ran 2026-05-03 in this session]
- `cargo check -p kei-mcp` + `cargo test --no-run -p kei-mcp` clean
Audit findings NOT yet addressed (deferred to next batch):
- README:65 git clone github URL — repo is private; reviewer flagged
external strangers cannot clone; will resolve via Quick Start rewrite
- npm.pkg.github.com / @keisei84 leftover sweep — both waves verified
ZERO refs, no fix needed
- safeEqual timing leak in TS server (W2 sec MEDIUM)
- HTTP server bind 0.0.0.0 (W2 sec MEDIUM)
- Unbounded request body (W2 ci MEDIUM)
- --dry-run silent ignored on non-outcome profiles (W1+W2 MEDIUM)
- Doc-link missing for MEMORY/DNA/LEDGER format specs from README
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
265 lines
11 KiB
Markdown
265 lines
11 KiB
Markdown
# Philosophy — KeiSeiKit as a Living Neural Structure
|
||
|
||
> The README claims KeiSeiKit is "a living neural structure." This document
|
||
> is the long form of that claim: which biological properties we chose,
|
||
> why, how each one maps to code shipped in this repo, and what the
|
||
> tradeoffs are.
|
||
|
||
---
|
||
|
||
## The one-line thesis
|
||
|
||
A software toolkit accumulates features. A living neural structure
|
||
accumulates identity, memory, lineage, and the ability to recover from its
|
||
own mistakes. KeiSeiKit is built around the second list.
|
||
|
||
## The five properties
|
||
|
||
A neural system is distinguishable from a collection of functions by five
|
||
properties. Each one is load-bearing — remove it and the structure stops
|
||
being alive in the relevant sense.
|
||
|
||
1. **Identity** — each unit has a stable, reproducible name, not a random
|
||
UUID or a human-friendly slug that drifts.
|
||
2. **Lineage** — each unit knows who produced it and from what.
|
||
3. **Memory** — the system remembers what it did yesterday, across
|
||
restarts, across machines, across sessions.
|
||
4. **Consolidation** — memory is not raw logs; it is periodically replayed
|
||
and compressed into patterns.
|
||
5. **Corrective learning** — mistakes the system notices in itself
|
||
surface as explicit artefacts (rules, hooks, tests) the next session
|
||
inherits.
|
||
|
||
The rest of this document maps each property to the shipped code.
|
||
|
||
---
|
||
|
||
## 1. Identity — DNA
|
||
|
||
Every agent invocation resolves to a deterministic variable-length string (≥33 chars; role + caps slugs are variable):
|
||
|
||
```
|
||
<role>::<caps-bitmap>::<scope-hash>::<body-hash>-<nonce>
|
||
```
|
||
|
||
- `role` — role slug (e.g. `edit-local`, `research-web`).
|
||
- `caps-bitmap` — the resolved capability list, encoded as ordered 2-char
|
||
atom codes (`NG` = no-git-ops, `FW` = files-whitelist, `TG` = tests-green).
|
||
- `scope-hash` — 8 hex chars (32-bit) of SHA-256 over canonicalised scope.
|
||
- `body-hash` — 8 hex chars of SHA-256 over the task body text.
|
||
- `nonce` — 8 hex chars of `rand::random::<u32>()`, full 32-bit entropy.
|
||
|
||
The shape is enforced by `kei-agent-runtime::dna::parse`. Two invocations
|
||
with the same role, capability set, scope, and task body produce the same
|
||
`<role>::<caps>::<scope>::<body>` prefix, and differ only by nonce. That
|
||
makes DNA both deterministic (for reasoning) and unique (for collision
|
||
resistance — birthday threshold ≈ 65k agents per role+caps group).
|
||
|
||
**Why not UUIDs.** A UUID hides what the agent was supposed to do. A DNA
|
||
string is greppable: you can look at a ledger row and see the capability
|
||
bitmap without joining five tables.
|
||
|
||
**Why not slugs.** Slugs collide and drift. DNA is stable under renames
|
||
because the role slug is part of the hash input, not a sidecar.
|
||
|
||
Source: [`_primitives/_rust/kei-agent-runtime/src/dna.rs`](../_primitives/_rust/kei-agent-runtime/src/dna.rs).
|
||
|
||
## 2. Lineage — creator_id and fork_parent_id
|
||
|
||
Every row in the agent ledger carries two additional columns:
|
||
|
||
- `creator_id` — DNA or human id of whoever spawned this row.
|
||
- `fork_parent_id` — DNA of the agent this row was forked from, if any.
|
||
|
||
This is SQL-level lineage. Any artefact produced during a session can be
|
||
traced back to the agent that produced it, to the agent that spawned
|
||
*that* agent, to the human session the chain started from.
|
||
|
||
**Why this matters.** Software without lineage produces "where did this
|
||
file come from" questions at merge time. Lineage makes them disappear:
|
||
the merge-ceremony skill prints the DAG and the human picks which forks
|
||
merge, which squash, which defer.
|
||
|
||
Schema: [`_primitives/_rust/kei-ledger/src/schema.rs`](../_primitives/_rust/kei-ledger/src/schema.rs) migration v4.
|
||
|
||
## 3. Memory — three layers
|
||
|
||
The memory layer is deliberately three-tiered, mirroring the hippocampal
|
||
+ cortical split:
|
||
|
||
1. **Raw episodes** — session JSONL traces, append-only, one file per
|
||
session. This is the hippocampus: fast, stateful, volatile (survives
|
||
until the next full pull), not interpreted.
|
||
2. **Project memory** — `memory/{project}.md` one file per project,
|
||
self-contained, constraints + stack + status + learnings with
|
||
evidence grades.
|
||
3. **Index** — `MEMORY.md`, one line per project, ≤200 lines total. No
|
||
inline data. Reading this file gives the shape of the world.
|
||
|
||
Any session's "what did we decide last time" is a read of the
|
||
corresponding project file, not a scroll through chat history. The index
|
||
guarantees that read is fast.
|
||
|
||
**Why this layering.** A single giant memory file stops being read
|
||
because it cannot be scanned. A million tiny files stop being read
|
||
because there is no entry point. Three layers — index → project file →
|
||
raw traces — give you O(1) navigation to the right detail.
|
||
|
||
Source: `~/.claude/rules/memory-protocol.md` (the full memory-protocol
|
||
rule, reusable across projects).
|
||
|
||
## 4. Consolidation — REM and NREM
|
||
|
||
Raw traces become patterns only if something replays them. KeiSeiKit's
|
||
sleep layer runs in three phases on a nightly schedule:
|
||
|
||
### Phase A — Incubation ("sleep on it")
|
||
|
||
During the day, you drop tasks into `/sleep-on-it`. Each task gets a
|
||
priority (quick 15 min / standard 60 min / deep 240 min / marathon
|
||
480 min) and optionally a checkpoint cadence. At 03:00 local a remote
|
||
Claude Code agent on Anthropic's cloud picks up the queue (up to 480
|
||
minutes total across ≤ 5 tasks, packed greedily in FIFO order) and
|
||
works until the budget or checkpoint fires.
|
||
|
||
Biological analog: the overnight consolidation of un-finished intentions
|
||
(Wagner et al. 2004, *Nature*). Things unsolved when you fell asleep are
|
||
often solved by morning not because the brain ran harder, but because
|
||
it ran offline.
|
||
|
||
### Phase B — REM consolidation
|
||
|
||
After Phase A, the same agent reads the last 24 h of JSONL traces, diffs
|
||
them against the previous report, and writes
|
||
`reports/sleep-YYYY-MM-DD.md`. Cross-session patterns (≥ 3 occurrences
|
||
across ≥ 2 distinct sessions) are prepended to `backlog.md`.
|
||
|
||
Biological analog: REM dream-state. Pattern extraction, not raw replay.
|
||
|
||
### Phase C — NREM deep sleep
|
||
|
||
Every seven days (by default; configurable to zero to disable) the
|
||
pipeline also runs `kei-conflict-scan` → `kei-refactor-engine` → optional
|
||
`kei-graph-check`. The output is a **plan-only** markdown file or a
|
||
**plan + fork** branch (`deep-sleep/YYYY-MM-DD`) with `git apply`-ready
|
||
changes. Ambiguous conflicts are excluded from any auto-patch and listed
|
||
explicitly for human decision.
|
||
|
||
Biological analog: NREM slow-wave sleep. System-level consolidation.
|
||
Integrating, not just reviewing.
|
||
|
||
### The no-feedback-loop invariant
|
||
|
||
Nothing the cloud agent writes is ever auto-injected into a Claude Code
|
||
session. The morning report is for human review. Any rule or hook that
|
||
emerges from it is installed by hand via `/escalate-recurrence`. This
|
||
is a deliberate architectural choice: auto-learning loops without human
|
||
signoff are how models drift.
|
||
|
||
Source: [`docs/SLEEP-LAYER.md`](./SLEEP-LAYER.md) and `~/.claude/rules/sleep-layer.md`.
|
||
|
||
## 5. Corrective learning — self-audit
|
||
|
||
Three passive hooks run during any session:
|
||
|
||
- `session-end-dump` — on Stop, archives the session trace and ingests
|
||
it into `kei-memory`.
|
||
- `milestone-commit-hook` — on `feat:` / `refactor:` / merge commits,
|
||
appends a one-line summary to `audit-backlog.md`.
|
||
- `error-spike-detector` — when three or more errors occur in the last
|
||
twenty tool calls, tags the pattern.
|
||
|
||
These feed the `/self-audit` skill, which classifies recurring problems
|
||
and surfaces them via click-only `AskUserQuestion`. The user can
|
||
route a finding to:
|
||
|
||
- `/escalate-recurrence` — codify as rule + wiki + optional hook.
|
||
- `/debug-deep` — 5-phase root-cause analysis.
|
||
- hook-only — mechanical block / enforce / warn / remind.
|
||
- backlog — log, surface next session.
|
||
- postpone — keep open, resurface later.
|
||
|
||
**Silent-first mode.** The first ten sessions log only — no prompts.
|
||
This prevents false-positive fatigue while the memory store is still
|
||
empty. Session 11 onward, the self-audit starts surfacing items.
|
||
|
||
Source: `~/.claude/rules/session-self-audit.md` (RULE 0.14) and
|
||
`skills/self-audit/`.
|
||
|
||
---
|
||
|
||
## Growth — the sixth property, emergent
|
||
|
||
A substrate that satisfies properties 1–5 can support a sixth that is
|
||
harder to design for directly: **growth**. New primitives, new blocks,
|
||
new agents, new projects enter through user-driven commands and
|
||
accumulate in a way the next session can find.
|
||
|
||
- **New primitive.** `/compose-solution` decomposes a free-text problem,
|
||
greps existing atoms for prior art, and drafts a block if nothing
|
||
matches. The draft is persisted on user click, discoverable thereafter.
|
||
- **New agent.** `/spawn-agent` emits a manifest + DNA + ledger row. The
|
||
assembler hook rebuilds the markdown Claude Code picks up.
|
||
- **New project.** `/new-project` is a 4-phase skill: intake, fork
|
||
skeleton, parallel execution (orchestrator owns git per RULE 0.13),
|
||
merge ceremony.
|
||
|
||
Growth is not a feature we implemented. It is what the other five
|
||
properties produce when you compose them.
|
||
|
||
---
|
||
|
||
## What this is not
|
||
|
||
A neural network. The name "neural structure" is an analogy about
|
||
properties (identity, lineage, memory, consolidation, corrective
|
||
learning), not a claim about weights or gradients. Nothing in KeiSeiKit
|
||
trains on your data. The cloud agent in the sleep layer is a standard
|
||
Claude Code session with scheduled triggers — it reads your traces to
|
||
write a report, not to fine-tune itself.
|
||
|
||
A federation. As of v0.24, KeiSeiKit ships as a single-user substrate
|
||
installed next to Claude Code. Cross-user signing, marketplace publishing
|
||
of blocks, and federation are on the roadmap but not yet shipped. If a
|
||
doc claims otherwise, that doc is stale.
|
||
|
||
A framework. A framework tells you how to structure your application. A
|
||
substrate gives your agents identity, lineage, memory, and sleep —
|
||
nothing about it dictates the application. You can delete every skill
|
||
in this repo and the substrate still works; you can also add fifty more
|
||
and it still works.
|
||
|
||
---
|
||
|
||
## The constraints that shaped this
|
||
|
||
Three constraints, made explicit because they push back against common
|
||
defaults:
|
||
|
||
- **Constructor Pattern.** One file, one class, one concern. Files
|
||
greater than 200 lines are decomposed. Functions greater than 30
|
||
lines are split. No mixins, no DI containers, no abstract factories.
|
||
This keeps the graph readable by both humans and Claude.
|
||
- **Rust-first default.** New primitive code is Rust unless there is a
|
||
cited exception (ML training > 10M params / existing-language project
|
||
/ platform UI / browser-DOM / one-off < 50 lines / external binding
|
||
only / explicit user override). The reason is not performance — it is
|
||
that the Rust type system catches the class of mistakes LLMs most
|
||
often introduce (`None` vs `[]`, missing `.await`, unhandled
|
||
`Result`) at compile time.
|
||
- **Local-first.** Nothing is pushed anywhere by default. The sleep
|
||
layer's memory-repo is user-owned, on whatever remote the user
|
||
chose (or no remote — everything works locally).
|
||
|
||
If these constraints feel restrictive, they are — deliberately. They
|
||
are the shape of the substrate, not decorations.
|
||
|
||
---
|
||
|
||
## Further reading
|
||
|
||
- [ARCHITECTURE.md](./ARCHITECTURE.md) — build pipeline + bridges + meta-composer
|
||
- [SLEEP-LAYER.md](./SLEEP-LAYER.md) — Phase A / B / C in depth
|
||
- [TAXONOMY.md](./TAXONOMY.md) — the seven-facet vocabulary
|
||
- [SUBSTRATE-SCHEMA.md](./SUBSTRATE-SCHEMA.md) — atom contract
|
||
- [WHY.md](./WHY.md) — the full origin story
|