KeiSeiKit-1.0/docs/PHILOSOPHY.md
Parfii-bot 3759fb0f64 fix(audit-batch): CI green + RULE 0.4/0.16/0.18 honesty pass
2026-05-03 19:09:59 +08:00


# Philosophy — KeiSeiKit as a Living Neural Structure
> The README claims KeiSeiKit is "a living neural structure." This document
> is the long form of that claim: which biological properties we chose,
> why, how each one maps to code shipped in this repo, and what the
> tradeoffs are.
---
## The one-line thesis
A software toolkit accumulates features. A living neural structure
accumulates identity, memory, lineage, and the ability to recover from its
own mistakes. KeiSeiKit is built around the second list.
## The five properties
A neural system is distinguishable from a collection of functions by five
properties. Each one is load-bearing — remove it and the structure stops
being alive in the relevant sense.
1. **Identity** — each unit has a stable, reproducible name, not a random
UUID or a human-friendly slug that drifts.
2. **Lineage** — each unit knows who produced it and from what.
3. **Memory** — the system remembers what it did yesterday, across
restarts, across machines, across sessions.
4. **Consolidation** — memory is not raw logs; it is periodically replayed
and compressed into patterns.
5. **Corrective learning** — mistakes the system notices in itself
surface as explicit artefacts (rules, hooks, tests) the next session
inherits.

The rest of this document maps each property to the shipped code.

---
## 1. Identity — DNA
Every agent invocation resolves to a variable-length string (≥33 chars; the role and caps segments vary in length) that is deterministic up to a trailing random nonce:
```
<role>::<caps-bitmap>::<scope-hash>::<body-hash>-<nonce>
```
- `role` — role slug (e.g. `edit-local`, `research-web`).
- `caps-bitmap` — the resolved capability list, encoded as ordered 2-char
atom codes (`NG` = no-git-ops, `FW` = files-whitelist, `TG` = tests-green).
- `scope-hash` — 8 hex chars (32-bit) of SHA-256 over canonicalised scope.
- `body-hash` — 8 hex chars of SHA-256 over the task body text.
- `nonce` — 8 hex chars of `rand::random::<u32>()`, full 32-bit entropy.

The shape is enforced by `kei-agent-runtime::dna::parse`. Two invocations
with the same role, capability set, scope, and task body produce the same
`<role>::<caps>::<scope>::<body>` prefix, and differ only by nonce. That
makes DNA both deterministic (for reasoning) and unique (for collision
resistance — birthday threshold ≈ 65k agents per role+caps group).
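The birthday figure can be checked with the standard approximation p ≈ 1 − exp(−n(n−1)/(2N)), where N = 2^32 is the size of the nonce space and n counts invocations that share one deterministic prefix. The sketch below is illustrative arithmetic only; the function name is hypothetical and nothing here is taken from `kei-agent-runtime`.

```rust
/// Approximate probability that at least two of `n` invocations sharing one
/// DNA prefix draw the same nonce, for a nonce of `nonce_bits` bits.
/// Uses the standard birthday approximation 1 - exp(-n(n-1) / (2N)).
/// Hypothetical helper for illustration; not part of the shipped crate.
pub fn nonce_collision_probability(n: u64, nonce_bits: u32) -> f64 {
    let space = 2f64.powi(nonce_bits as i32); // N = 2^bits possible nonces
    let n = n as f64;
    1.0 - (-(n * (n - 1.0)) / (2.0 * space)).exp()
}
```

Under this approximation, 65,536 same-prefix invocations (2^16) give a collision probability of roughly 39%, and the 50% point sits near 77,000, so "≈ 65k" reads best as an order-of-magnitude threshold rather than an exact limit.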
**Why not UUIDs.** A UUID hides what the agent was supposed to do. A DNA
string is greppable: you can look at a ledger row and see the capability
bitmap without joining five tables.

**Why not slugs.** Slugs collide and drift. DNA is stable under renames
because the role slug is part of the hash input, not a sidecar.

Source: [`_primitives/_rust/kei-agent-runtime/src/dna.rs`](../_primitives/_rust/kei-agent-runtime/src/dna.rs).
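As a rough illustration of the shape described in this section, the sketch below splits a DNA string into its five segments and extracts the deterministic prefix. It is not the code in `dna.rs`: `DnaParts`, `split_dna`, and `deterministic_prefix` are hypothetical names, the caps segment is kept as an opaque string because the way atom codes are joined is not specified here, and the sample strings in the usage note are made up.

```rust
/// Borrowed view of the five DNA segments. Field names are illustrative.
#[derive(Debug, PartialEq)]
pub struct DnaParts<'a> {
    pub role: &'a str,
    pub caps: &'a str,
    pub scope_hash: &'a str,
    pub body_hash: &'a str,
    pub nonce: &'a str,
}

/// True when `s` is exactly eight ASCII hex digits (a 32-bit value).
fn is_hex8(s: &str) -> bool {
    s.len() == 8 && s.bytes().all(|b| b.is_ascii_hexdigit())
}

/// Split `<role>::<caps-bitmap>::<scope-hash>::<body-hash>-<nonce>`.
/// Returns `None` when a segment is missing, the role is empty, or a
/// hash or nonce is not eight hex characters.
pub fn split_dna(dna: &str) -> Option<DnaParts<'_>> {
    let mut segments = dna.split("::");
    let role = segments.next()?;
    let caps = segments.next()?;
    let scope_hash = segments.next()?;
    let tail = segments.next()?;
    if segments.next().is_some() || role.is_empty() {
        return None;
    }
    // The role slug may contain '-', but the tail holds only hex and one '-'.
    let (body_hash, nonce) = tail.split_once('-')?;
    if !(is_hex8(scope_hash) && is_hex8(body_hash) && is_hex8(nonce)) {
        return None;
    }
    Some(DnaParts { role, caps, scope_hash, body_hash, nonce })
}

/// The deterministic part of a well-formed DNA string: everything before
/// the nonce. Returns `None` for strings that fail `split_dna`.
pub fn deterministic_prefix(dna: &str) -> Option<&str> {
    split_dna(dna)?;
    dna.rsplit_once('-').map(|(prefix, _)| prefix)
}
```

With this sketch, `edit-local::NGFWTG::0a1b2c3d::4e5f6a7b-89abcdef` and the same string ending in `-00000001` yield the same `deterministic_prefix`, which is the property the section relies on: same inputs, same prefix, different nonce.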
## 2. Lineage — creator_id and fork_parent_id
Every row in the agent ledger carries two additional columns:
- `creator_id` — DNA or human id of whoever spawned this row.
- `fork_parent_id` — DNA of the agent this row was forked from, if any.

This is SQL-level lineage. Any artefact produced during a session can be
traced back to the agent that produced it, to the agent that spawned
*that* agent, to the human session the chain started from.

**Why this matters.** Software without lineage produces "where did this
file come from" questions at merge time. Lineage makes them disappear:
the merge-ceremony skill prints the DAG and the human picks which forks
merge, which squash, which defer.

Schema: [`_primitives/_rust/kei-ledger/src/schema.rs`](../_primitives/_rust/kei-ledger/src/schema.rs) migration v4.
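The trace from an agent back to the human session can be pictured as a walk over `creator_id` links. The sketch below does that walk over an in-memory map instead of SQL. `LedgerRow` and `creator_chain` are hypothetical names, the row is reduced to the two lineage columns plus its own DNA, and treating "no row found" as "this id is a human" is an assumption made for the example, not something the ledger schema is known to guarantee.

```rust
use std::collections::{HashMap, HashSet};

/// Simplified stand-in for a ledger row; only the lineage columns are kept.
pub struct LedgerRow {
    pub dna: String,
    pub creator_id: String,
    pub fork_parent_id: Option<String>,
}

/// Follow `creator_id` links from `start` until an id has no ledger row
/// (assumed here to be a human id) or a cycle is detected.
pub fn creator_chain(rows: &HashMap<String, LedgerRow>, start: &str) -> Vec<String> {
    let mut chain = Vec::new();
    let mut seen = HashSet::new();
    let mut current = start.to_string();
    // `insert` returns false on a repeat, which stops the walk on cycles.
    while seen.insert(current.clone()) {
        chain.push(current.clone());
        match rows.get(&current) {
            Some(row) => current = row.creator_id.clone(),
            None => break, // no row for this id: end of the chain
        }
    }
    chain
}
```

The same loop with `fork_parent_id` in place of `creator_id` would give the fork ancestry; the two chains answer different questions (who spawned this versus what was this forked from).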
## 3. Memory — three layers
The memory layer is deliberately three-tiered, mirroring the
hippocampal + cortical split:
1. **Raw episodes** — session JSONL traces, append-only, one file per
session. This is the hippocampus: fast, stateful, volatile (survives
until the next full pull), not interpreted.
2. **Project memory** – `memory/{project}.md`, one file per project,
self-contained, constraints + stack + status + learnings with
evidence grades.
3. **Index** – `MEMORY.md`, one line per project, ≤200 lines total. No
inline data. Reading this file gives the shape of the world.

Any session's "what did we decide last time" is a read of the
corresponding project file, not a scroll through chat history. The index
guarantees that read is fast.

**Why this layering.** A single giant memory file stops being read
because it cannot be scanned. A million tiny files stop being read
because there is no entry point. Three layers — index → project file →
raw traces — give you O(1) navigation to the right detail.

Source: `~/.claude/rules/memory-protocol.md` (the full memory-protocol
rule, reusable across projects).
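The two mechanical parts of this layout, the per-project path and the index line budget, are small enough to sketch. The helper names below are hypothetical and the functions only restate the conventions given in this section; they are not taken from the memory-protocol rule itself.

```rust
/// Line budget for the index file, as stated in this section.
pub const INDEX_MAX_LINES: usize = 200;

/// Path of the self-contained memory file for `project`,
/// following the `memory/{project}.md` convention.
pub fn project_memory_path(project: &str) -> String {
    format!("memory/{project}.md")
}

/// True when the index text stays within the 200-line budget.
pub fn index_within_budget(index_text: &str) -> bool {
    index_text.lines().count() <= INDEX_MAX_LINES
}
```

A check like `index_within_budget` could run whenever the index is rewritten, so the file stays scannable by construction rather than by habit.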
## 4. Consolidation — REM and NREM
Raw traces become patterns only if something replays them. KeiSeiKit's
sleep layer runs in three phases on a nightly schedule:
### Phase A — Incubation ("sleep on it")
During the day, you drop tasks into `/sleep-on-it`. Each task gets a
priority (quick 15 min / standard 60 min / deep 240 min / marathon
480 min) and optionally a checkpoint cadence. At 03:00 local a remote
Claude Code agent on Anthropic's cloud picks up the queue (up to 480
minutes total across ≤ 5 tasks, packed greedily in FIFO order) and
works until the budget runs out or a checkpoint fires.

Biological analog: the overnight consolidation of unfinished intentions
(Wagner et al. 2004, *Nature*). Things unsolved when you fell asleep are
often solved by morning not because the brain ran harder, but because
it ran offline.
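The Phase A queue rule (at most 480 minutes and at most five tasks, packed greedily in FIFO order) can be sketched as follows. `Priority` and `pack_queue` are hypothetical names, and one detail is an assumption: a task that does not fit the remaining budget is skipped and later tasks are still considered. The scheduler may instead stop at the first task that does not fit.

```rust
/// The four task priorities and their time budgets in minutes.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Priority {
    Quick,
    Standard,
    Deep,
    Marathon,
}

impl Priority {
    pub fn minutes(self) -> u32 {
        match self {
            Priority::Quick => 15,
            Priority::Standard => 60,
            Priority::Deep => 240,
            Priority::Marathon => 480,
        }
    }
}

/// Pick queue indices in FIFO order until 480 minutes or 5 tasks are used.
/// Assumption: a task that does not fit is skipped, not a hard stop.
pub fn pack_queue(queue: &[Priority]) -> Vec<usize> {
    const BUDGET_MINUTES: u32 = 480;
    const MAX_TASKS: usize = 5;
    let mut remaining = BUDGET_MINUTES;
    let mut picked = Vec::new();
    for (index, task) in queue.iter().enumerate() {
        if picked.len() == MAX_TASKS {
            break;
        }
        if task.minutes() <= remaining {
            remaining -= task.minutes();
            picked.push(index);
        }
    }
    picked
}
```

For a queue of deep, marathon, standard, deep, quick, this sketch picks the first, third, and fifth tasks (240 + 60 + 15 minutes); the marathon and the second deep task wait for another night.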
### Phase B — REM consolidation
After Phase A, the same agent reads the last 24 h of JSONL traces, diffs
them against the previous report, and writes
`reports/sleep-YYYY-MM-DD.md`. Cross-session patterns (≥ 3 occurrences
across ≥ 2 distinct sessions) are prepended to `backlog.md`.

Biological analog: REM dream-state. Pattern extraction, not raw replay.
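The cross-session threshold in Phase B (at least three occurrences across at least two distinct sessions) is a simple filter once traces have been reduced to (session, pattern) pairs. The sketch below assumes that reduction has already happened; `cross_session_patterns` is a hypothetical name, and how the real pipeline extracts patterns from JSONL traces is not shown.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Given `(session_id, pattern)` observations, return the patterns seen at
/// least 3 times across at least 2 distinct sessions, in sorted order.
pub fn cross_session_patterns<'a>(events: &[(&'a str, &'a str)]) -> Vec<&'a str> {
    // pattern -> (total occurrences, set of sessions it appeared in)
    let mut stats: BTreeMap<&'a str, (usize, BTreeSet<&'a str>)> = BTreeMap::new();
    for &(session, pattern) in events {
        let entry = stats.entry(pattern).or_default();
        entry.0 += 1;
        entry.1.insert(session);
    }
    stats
        .into_iter()
        .filter(|(_, (count, sessions))| *count >= 3 && sessions.len() >= 2)
        .map(|(pattern, _)| pattern)
        .collect()
}
```

Requiring both conditions filters out a single noisy session (many repeats, one session) as well as coincidences (two sessions, one occurrence each).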
### Phase C — NREM deep sleep
Every seven days (by default; configurable to zero to disable) the
pipeline also runs `kei-conflict-scan` → `kei-refactor-engine` → optional
`kei-graph-check`. The output is a **plan-only** markdown file or a
**plan + fork** branch (`deep-sleep/YYYY-MM-DD`) with `git apply`-ready
changes. Ambiguous conflicts are excluded from any auto-patch and listed
explicitly for human decision.

Biological analog: NREM slow-wave sleep. System-level consolidation.
Integrating, not just reviewing.
### The no-feedback-loop invariant
Nothing the cloud agent writes is ever auto-injected into a Claude Code
session. The morning report is for human review. Any rule or hook that
emerges from it is installed by hand via `/escalate-recurrence`. This
is a deliberate architectural choice: auto-learning loops without human
signoff are how models drift.

Source: [`docs/SLEEP-LAYER.md`](./SLEEP-LAYER.md) and `~/.claude/rules/sleep-layer.md`.
## 5. Corrective learning — self-audit
Three passive hooks run during any session:
- `session-end-dump` — on Stop, archives the session trace and ingests
it into `kei-memory`.
- `milestone-commit-hook` — on `feat:` / `refactor:` / merge commits,
appends a one-line summary to `audit-backlog.md`.
- `error-spike-detector` — when three or more errors occur in the last
twenty tool calls, tags the pattern.

These feed the `/self-audit` skill, which classifies recurring problems
and surfaces them via click-only `AskUserQuestion`. The user can
route a finding to:
- `/escalate-recurrence` — codify as rule + wiki + optional hook.
- `/debug-deep` — 5-phase root-cause analysis.
- hook-only — mechanical block / enforce / warn / remind.
- backlog — log, surface next session.
- postpone — keep open, resurface later.

**Silent-first mode.** The first ten sessions log only — no prompts.
This prevents false-positive fatigue while the memory store is still
empty. Session 11 onward, the self-audit starts surfacing items.
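Two of the thresholds in this section are mechanical enough to sketch: the `error-spike-detector` rule (three or more errors in the last twenty tool calls) and the silent-first gate (nothing surfaces before session 11). The type and function names below are hypothetical, and the real hook is not necessarily implemented as a sliding window like this one.

```rust
use std::collections::VecDeque;

pub const WINDOW: usize = 20; // tool calls considered
pub const SPIKE_THRESHOLD: usize = 3; // errors that count as a spike
pub const SILENT_SESSIONS: u32 = 10; // sessions that only log

/// Sliding window over the outcome of the most recent tool calls.
#[derive(Default)]
pub struct ErrorSpikeDetector {
    recent: VecDeque<bool>, // true = the call errored
    errors: usize,          // number of `true` entries in `recent`
}

impl ErrorSpikeDetector {
    pub fn new() -> Self {
        Self::default()
    }

    /// Record one tool call; returns true while the window holds a spike.
    pub fn record(&mut self, is_error: bool) -> bool {
        self.recent.push_back(is_error);
        if is_error {
            self.errors += 1;
        }
        // Drop the oldest call once the window exceeds twenty entries.
        if self.recent.len() > WINDOW && self.recent.pop_front() == Some(true) {
            self.errors -= 1;
        }
        self.errors >= SPIKE_THRESHOLD
    }
}

/// Silent-first gate: findings surface from session 11 onward.
pub fn should_surface(session_number: u32) -> bool {
    session_number > SILENT_SESSIONS
}
```

Keeping a running error count makes each `record` call constant-time, which matters for a hook that runs on every tool call.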
Source: `~/.claude/rules/session-self-audit.md` (RULE 0.14) and
`skills/self-audit/`.

---
## Growth — the sixth property, emergent
A substrate that satisfies properties 1–5 can support a sixth that is
harder to design for directly: **growth**. New primitives, new blocks,
new agents, new projects enter through user-driven commands and
accumulate in a way the next session can find.
- **New primitive.** `/compose-solution` decomposes a free-text problem,
greps existing atoms for prior art, and drafts a block if nothing
matches. The draft is persisted on user click, discoverable thereafter.
- **New agent.** `/spawn-agent` emits a manifest + DNA + ledger row. The
assembler hook rebuilds the markdown Claude Code picks up.
- **New project.** `/new-project` is a 4-phase skill: intake, fork
skeleton, parallel execution (orchestrator owns git per RULE 0.13),
merge ceremony.

Growth is not a feature we implemented. It is what the other five
properties produce when you compose them.

---
## What this is not
A neural network. The name "neural structure" is an analogy about
properties (identity, lineage, memory, consolidation, corrective
learning), not a claim about weights or gradients. Nothing in KeiSeiKit
trains on your data. The cloud agent in the sleep layer is a standard
Claude Code session with scheduled triggers — it reads your traces to
write a report, not to fine-tune itself.

A federation. As of v0.24, KeiSeiKit ships as a single-user substrate
installed next to Claude Code. Cross-user signing, marketplace publishing
of blocks, and federation are on the roadmap but not yet shipped. If a
doc claims otherwise, that doc is stale.

A framework. A framework tells you how to structure your application. A
substrate gives your agents identity, lineage, memory, and sleep —
nothing about it dictates the application. You can delete every skill
in this repo and the substrate still works; you can also add fifty more
and it still works.

---
## The constraints that shaped this
Three constraints, made explicit because they push back against common
defaults:
- **Constructor Pattern.** One file, one class, one concern. Files
longer than 200 lines are decomposed. Functions longer than 30
lines are split. No mixins, no DI containers, no abstract factories.
This keeps the graph readable by both humans and Claude.
- **Rust-first default.** New primitive code is Rust unless there is a
cited exception (ML training > 10M params / existing-language project
/ platform UI / browser-DOM / one-off < 50 lines / external binding
only / explicit user override). The reason is not performance – it is
that the Rust type system catches the class of mistakes LLMs most
often introduce (`None` vs `[]`, missing `.await`, unhandled
`Result`) at compile time.
- **Local-first.** Nothing is pushed anywhere by default. The sleep
layer's memory-repo is user-owned, on whatever remote the user
chose (or no remote – everything works locally).
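To make the Rust-first rationale concrete, the sketch below shows two of the cited mistake classes as types: "absent" versus "empty" (`None` vs `[]`) and a fallible call whose result has to be handled before the value can be used. All names are hypothetical and the default port is arbitrary; the missing-`.await` case is left out because it would need an async runtime.

```rust
use std::num::ParseIntError;

/// `None` means the lookup never ran; `Some(&[])` means it ran and found
/// nothing. The type keeps the two apart, and `match` must cover both.
pub fn describe_hits(hits: Option<&[String]>) -> &'static str {
    match hits {
        None => "not searched",
        Some([]) => "searched, nothing found",
        Some(_) => "searched, found matches",
    }
}

/// Parsing can fail, and the return type says so. A caller cannot read the
/// port number without deciding what to do about the `Err` case.
pub fn parse_port(text: &str) -> Result<u16, ParseIntError> {
    text.trim().parse::<u16>()
}

/// One explicit way to handle the failure: fall back to a default.
pub fn port_or_default(text: &str) -> u16 {
    parse_port(text).unwrap_or(8080)
}
```

In a language where both "not searched" and "nothing found" are an empty list, the first function's distinction lives only in comments; here dropping the `None` arm is a compile error.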
If these constraints feel restrictive, they are – deliberately. They
are the shape of the substrate, not decorations.

---
## Further reading
- [ARCHITECTURE.md](./ARCHITECTURE.md) – build pipeline + bridges + meta-composer
- [SLEEP-LAYER.md](./SLEEP-LAYER.md) – Phase A / B / C in depth
- [TAXONOMY.md](./TAXONOMY.md) – the seven-facet vocabulary
- [SUBSTRATE-SCHEMA.md](./SUBSTRATE-SCHEMA.md) – atom contract
- [WHY.md](./WHY.md) – the full origin story