KeiSeiKit-1.0/docs/PROFILE-OUTCOME-ONLY.md
Parfii-bot 3759fb0f64 fix(audit-batch): CI green + RULE 0.4/0.16/0.18 honesty pass
12-agent audit (2 waves Opus+Sonnet, 6 slices each) flagged 3 HIGH-tier
issues that BOTH waves agreed on, plus 5 doc-honesty findings. This
batch fixes the lot.

== CI green (was failing on main 94a7d68) ==

- _primitives/_rust/Cargo.toml — workspace tokio gains `io-std` feature
  (needed by kei-mcp/src/main.rs which calls tokio::io::{stdin,stdout})
- _primitives/_rust/kei-mcp/Cargo.toml — dev-deps tokio gains `test-util`
  feature (needed by tests/tools_call_timeout.rs for tokio::time::advance
  and Builder::start_paused). Both verified locally:
  `cargo check -p kei-mcp` ✓
  `cargo test --no-run -p kei-mcp` ✓ (3 test binaries link)
  [REAL: ran 2026-05-03 in this session]

== HIGH-tier audit fixes (consensus across waves) ==

1. SQLi escape in agent-outcome-backfill.sh:110
   - 4 of 12 agents flagged: TOOL_USE_ID was JSON-derived and
     interpolated raw into SQL. Allowlist on $SHIPPED protected today
     but a future case-statement removal opened the surface.
   - Fix: tiny `_sql_esc` helper that doubles single-quotes (SQL-99
     standard escape), applied to SHIPPED + TOOL_USE_ID. STUBS already
     integer-validated.

2. PRAGMA user_version=9 in install/sql/outcome-only-schema.sql
   - W1 outcome-only critic flagged: the SQL fallback installed a
     v9-equivalent flat schema but left user_version=0. A LATER
     `kei-ledger init` (e.g. when user upgrades to full kit) would
     re-run migrations v1-v9 and ALTER TABLE ADD COLUMN duplicate-error
     mid-migration → broken DB.
   - Fix: set PRAGMA user_version=9 before COMMIT so the binary's
     migration runner sees current ≥ target and short-circuits.

3. backup_file mv→cp + uninstall macOS-portable awk
   - W1+W2 outcome-only flagged: lib-backup.sh uses `mv` which DELETES
     the target before _jq_merge_hooks runs; `|| true` swallowed the
     subsequent jq read-error → silent settings.json loss.
   - Fix in lib-profile-outcome-only.sh: `cp -p` aside, drop `|| true`,
     return 1 on merge failure (trap restores).
   - PROFILE-OUTCOME-ONLY.md uninstall used GNU sed `,+1` extension
     which BSD sed (macOS) does not support — uninstall silently
     no-op'd on macOS, leaving orphan CLAUDE.md text.
   - Fix: replace with portable `awk` recipe; also added `rm -f` for
     the agent-toolstats.jsonl sidecar (privacy completeness).

== Doc honesty pass (RULE 0.18 numerics + RULE 0.4 citations) ==

4. README.md count drift — verified all values against filesystem:
   * 102→105 Rust crates (Cargo.toml workspace `members` count)
   * 67→68 skills (`ls skills/ | wc -l`)
   * 35→38 hooks (`grep -c '"command":' settings-snippet.json`)
   * 37→38 agent manifests (`ls _manifests/*.toml | wc -l`)
   * 82→85 substrate blocks (`find _blocks/ -name '*.md' | wc -l`)
   * 18 capability atoms VERIFIED via `find _capabilities/ -name '*.md'`
     (encyclopedia §3 row count of 17 is in a separate file and is a
     known internal display issue, not changed in this commit)
   * 495→565 active DNAs (per docs/DNA-INDEX.md header 2026-05-03)
   Each value now carries a `[REAL: <command>]` style trailer per
   RULE 0.18.

5. README.md DNA "80-char identity" → "≥33-char variable-length"
   - W1+W2 reviewer-pass flagged FALSE: docs/DNA-FORMAT.md SSoT says
     minimum 33 chars; 80 was nowhere in code or spec
   - Fix in README.md:36 + docs/PHILOSOPHY.md:39 + docs/DNA-INDEX.md:1352

6. README.md "Eleven install profiles (... Cursor / Continue / Zed /
   Aider / Docker / Nix)" — Cursor/Continue/Zed/Aider/Docker/Nix were
   never install profiles, they were bridge targets
   - Fix: list 12 actual profiles from _primitives/MANIFEST.toml,
     mention bridges as separate concept

7. .claude-plugin/plugin.json license MIT → Apache-2.0
   - W2-Sonnet reviewer flagged: LICENSE file is Apache-2.0 (since
     2026-04-30 per NOTICE), but plugin.json still declared MIT —
     plugin marketplace would show wrong license

8. docs/ARCHITECTURE.md:318 placeholder URL `https://example.invalid/...`
   - W2-Sonnet reviewer flagged: dead link in published docs
   - Fix: remove the bad href, describe ssl-rule-file as per-user
     install outside the public repo

9. skills/sleep-on-it/SKILL.md Wagner et al. 2004 citation
   - W1+W2 reviewer flagged RULE 0.4 violation: citation without
     verification marker
   - Fix: added [VERIFIED: doi:10.1038/nature02223] + clarification
     that the original paper showed slow-wave-sleep (not strictly REM)
     insight gain — our metaphor is a loose mapping

10. encyclopedia/substrate-overview.md §5 fabricated TS deps
    - W1-Opus doc-consistency flagged RULE 0.4.b violation: 5 of 6
      package rows had INVENTED dependency strings
      (`recall-ai-sdk ^1.0.0`, `nodemailer-mock ^2.0.0`,
       `telegram-typings ^4.10.0`, etc — none exist in the actual
      package.json files)
    - Fix: regenerated table from real `package.json` reads via
      `node -p "require(...).dependencies"` for each of the 6 packages
    - Fix: also corrected version drift (5 packages all 0.14.0 now)

Verification:
- Outcome-only end-to-end install against fake $HOME succeeds:
  hooks installed, ledger schema at user_version=9, settings.json
  created cleanly, all 5 documented files present
  [REAL: ran 2026-05-03 in this session]
- `cargo check -p kei-mcp` + `cargo test --no-run -p kei-mcp` clean

Audit findings NOT yet addressed (deferred to next batch):
- README:65 git clone github URL — repo is private; reviewer flagged
  external strangers cannot clone; will resolve via Quick Start rewrite
- npm.pkg.github.com / @keisei84 leftover sweep — both waves verified
  ZERO refs, no fix needed
- safeEqual timing leak in TS server (W2 sec MEDIUM)
- HTTP server bind 0.0.0.0 (W2 sec MEDIUM)
- Unbounded request body (W2 ci MEDIUM)
- --dry-run silent ignored on non-outcome profiles (W1+W2 MEDIUM)
- Doc-link missing for MEMORY/DNA/LEDGER format specs from README

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 19:09:59 +08:00

108 lines
4.9 KiB
Markdown

# `outcome-only` install profile
> Five-file pitch: install the outcome-tracking primitive without
> committing to anything else. No daemon, no Forgejo, no launchd, no
> hundred Rust crates, no `no-github-push` hook, no agent generation.
> If you do not like what `~/.claude/agents/ledger.sqlite` collects,
> the uninstall is a four-line shell paste at the bottom.
## What gets installed
| # | Path | Source | LOC |
|---|--------------------------------------------------------|---------------------------------------|-----|
| 1 | `~/.claude/hooks/agent-outcome-backfill.sh` | `hooks/agent-outcome-backfill.sh` | 140 |
| 2 | `~/.claude/hooks/error-spike-detector.sh` | `hooks/error-spike-detector.sh` | 89 |
| 3 | `~/.claude/agents/ledger.sqlite` | `install/sql/outcome-only-schema.sql` (or `kei-ledger init`) | n/a |
| 4 | one appended line in `~/.claude/CLAUDE.md` | the STATUS-TRUTH MARKER instruction | 1 |
| 5 | `_primitives/_rust/kei-model-router/target/release/kei-model-router` (deferred) | `_primitives/_rust/kei-model-router/` | n/a |
Plus a jq-merge of two hooks into `~/.claude/settings.json`:
- `PostToolUse:Agent``agent-outcome-backfill.sh`
- `PostToolUse:*``error-spike-detector.sh`
`./install.sh --profile=outcome-only --dry-run` prints exactly this
list and exits 0 without writing.
## What does NOT get installed
- 102 Rust crates (cortex, frustration-loop, sleep-layer, …)
- 67 skills, 37 agent manifests, 82 substrate blocks
- `kei-cortex` HTTP / WS daemon
- Forgejo, dev hub, Datasette, restic, mdbook, gdrive-import
- launchd plists (`disk-reclaim`, sleep-layer cron)
- `no-github-push.sh` hook (or any other Bash gate)
- substrate PATH wiring (no edits to your shell rc files)
If you later want any of those, the kit is incremental: re-run
`./install.sh --profile=core` (or heavier) and the outcome-only state
is preserved verbatim — both paths share `~/.claude/hooks/` and
`~/.claude/agents/ledger.sqlite`.
## How `kei-model-router` activates
The router is a posterior decision rule keyed on per-task-class DNA
plus a Beta posterior over `(success, total)` in `agents.outcome`.
Until you accumulate ~100 outcome rows, the router falls back to
"behaviour unchanged" — every spawn keeps whatever model the agent
manifest declares.
After ~100 rows the posterior dominates the prior and the router
starts producing concrete recommendations. You opt in by adding
`kei-model-router` to a `PreToolUse:Agent` hook later — that step is
**not** done by this profile. You stay in observe-only mode by default.
If `cargo` is on PATH at install time the binary is built into
`_primitives/_rust/kei-model-router/target/release/`. If `cargo` is
missing the build is skipped silently and the install is still
considered complete; rebuild later with:
```bash
cd _primitives/_rust/kei-model-router && cargo build --release
```
## Privacy posture
All outcome rows live in `~/.claude/agents/ledger.sqlite`. They never
leave the machine — no sync hook, no remote-push, no telemetry.
Inspect with:
```bash
sqlite3 ~/.claude/agents/ledger.sqlite \
"SELECT id, branch, status, outcome, stubs_count, started_ts FROM agents
ORDER BY started_ts DESC LIMIT 20;"
```
Uncomfortable with the file? `rm` it; the next install or agent run
recreates an empty schema, no other side effects.
## Uninstall
```bash
rm -f ~/.claude/hooks/agent-outcome-backfill.sh
rm -f ~/.claude/hooks/error-spike-detector.sh
rm -f ~/.claude/agents/ledger.sqlite
rm -f ~/.claude/memory/time-metrics/agent-toolstats.jsonl
# CLAUDE.md cleanup — portable across BSD (macOS) and GNU sed via awk.
# The original line `sed -i.bak '/.../,+1 d'` used the GNU `,+N` address
# extension which BSD sed does NOT support; on macOS it silently no-ops,
# leaving an orphan instruction. The awk recipe below works on both.
awk 'BEGIN{skip=0} /<!-- outcome-only profile \(KeiSeiKit\) -->/ {skip=2; next} skip>0 {skip--; next} {print}' \
~/.claude/CLAUDE.md > ~/.claude/CLAUDE.md.tmp \
&& mv ~/.claude/CLAUDE.md.tmp ~/.claude/CLAUDE.md
```
Both hooks exit 0 immediately when their target script is missing, so
the `~/.claude/settings.json` jq-merge entries are harmless after
`rm`. To scrub those too, drop `agent-outcome-backfill.sh` /
`error-spike-detector.sh` lines from `settings.json` by hand.
The 5th `rm` removes the sidecar telemetry JSONL the backfill hook
writes (per-agent token counts + tool stats; local-only, no network
egress, but worth deleting if you uninstalled for privacy reasons).
## Why this profile exists
A kit with 100 crates / Forgejo / launchd plists is too heavy to
evaluate. A pitch you can read in four minutes and trial in five is
not. This profile is the answer to "what is the smallest version of
KeiSeiKit that still demonstrates the outcome loop?" — and nothing more.