Parfii-bot 784dfbae6f fix(audit-batch-2): regressions from prev batch + 2nd-wave audit findings

12-agent audit (waves 3+4 Opus+Sonnet) on commit 3759fb0 found that 2 of
my prior fixes had regressions, plus the prev batch missed 8 stale-text
sites and 2 latent bugs. This batch closes them all.

== Regressions in audit-batch (3759fb0) — now fixed ==

1. PRAGMA user_version=9 placement — could silently downgrade schema on
   cross-version install (existing v10 DB → re-run reset to 9 →
   migrations replay → ALTER TABLE duplicate-column errors)
   - install/sql/outcome-only-schema.sql: PRAGMA moved OUTSIDE the
     transaction (after COMMIT) for portability across SQLite versions
   - install/lib-profile-outcome-only.sh::_outcome_install_ledger:
     added downgrade guard — reads existing user_version BEFORE running
     ANY init path; if >9, skips entirely (preserves newer schema)
   - VERIFIED: simulated v10 DB → re-run prints "skipping init to
     preserve newer schema"; user_version stays at 10 (was downgraded
     to 9 in the prior batch) [REAL: ran in this session]

2. backup_file mv→cp workaround left orphan backups + bypassed rollback
   contract (BACKUP_PAIRS not registered)
   - install/lib-profile-outcome-only.sh: now manually appends to
     BACKUP_PAIRS so rollback trap restores on later failure;
     removes the .bak on success path
   - Comment updated to explain the workaround vs backup_file mv

3. CLAUDE.md skip-guard "STATUS-TRUTH MARKER" was too broad —
   false-positive on existing kit users (RULE 0.16 doc text matches)
   - lib-profile-outcome-only.sh: changed grep to literal HTML comment
     marker `<!-- outcome-only profile (KeiSeiKit) -->` (specific marker
     written by the installer itself)

== Tier 1 missed in prev batch — now fixed ==

4. _ts_packages/package-lock.json referenced packages/cortex-ui which
   does NOT exist on disk → npm ci would fail with ELSPROBLEMS in CI
   - Regenerated via fresh `rm package-lock.json && npm install`
   - npm ci now exits 0 cleanly [REAL: ran in this session]
   - Lockfile shrunk 2403→0 lines on the cortex-ui section (full regen)

5. v3 triggers (branch length cap ≤256) were MISSING from
   outcome-only-schema.sql — sqlite3 fallback path skipped a schema
   feature that the Rust kei-ledger flow enforces, creating cross-flow
   drift
   - Added trg_agents_branch_len_ins + trg_agents_branch_len_upd
     mirroring migrations_list.rs:30-44
   - Header comment in outcome-only-schema.sql rewritten to match
     current behavior (was stale)
   - VERIFIED: end-to-end install creates 2 triggers [REAL: sqlite3
     .schema | grep trg_agents_branch_len returns 2]

6. README.md:232 said "102 crates" while README.md:9 said "105 crates"
   — internal contradiction in same doc
   - README:232 → "105 workspace crates"

7. ARCHITECTURE.md:165 "53 Rust crates + 13 shell primitives" stale
   - Updated to "105 Rust workspace crates (47 declared in MANIFEST.toml
     `full` profile) + 14 shell primitives"

8. ARCHITECTURE.md:157 "45 /commands" stale
   - Updated to 68

9. plugin.json + marketplace.json description strings still had
   pre-fix counts (23 primitives / 39 skills / 9 hooks / 12 agents)
   - Both rewritten to match README:9 SSoT (38 agents / 68 skills /
     38 hooks / 105 workspace crates / 47 installable + 14 shell)

10. PROFILE-OUTCOME-ONLY.md:28-29 "What does NOT get installed" still
    cited 102/67/37/82
    - Updated to 105/68/38/85

11. encyclopedia/substrate-overview.md §6/§11/§12 still said
    "80-char DNA"; §13 said "495 DNA indices"; §6 said "11 install
    profiles (.../Cursor/Continue/etc)"
    - All 4 sites fixed to current language (≥33-char variable, 565
      DNAs, 12 install profiles)

12. docs/DNA-INDEX.md:1352 said wire format is "(80 chars)"
    - Updated to "(≥33 chars; role + caps slugs are variable — see
      docs/DNA-FORMAT.md)"

== Tier 2 honesty fixes ==

13. Wagner et al. 2004 citation in SLEEP-LAYER.md:26 lacked [VERIFIED]
    marker (W3 doc consistency caught it)
    - Added [VERIFIED: doi:10.1038/nature02223] + clarification that
      the original study did not isolate a specific sleep stage; SWS
      attribution comes from secondary literature (Diekelmann/Born)

14. PHILOSOPHY.md:125 attributed "overnight consolidation of un-finished
    intentions" to Wagner 2004 — that paper is about insight gain on
    the Number Reduction Task, not Zeigarnik-effect cued memory
    - Rewritten to accurately describe Wagner 2004's actual finding +
      [VERIFIED: doi:10.1038/nature02223]

Verification:
- `npm ci` in _ts_packages/ exits 0 [REAL: ran in this session]
- `cargo check --workspace` exits 0 in _primitives/_rust [REAL: ran in
  this session]
- Outcome-only end-to-end fresh install produces user_version=9 +
  2 triggers (correct schema shape)
- Outcome-only re-run against v10 DB preserves user_version=10
  (downgrade guard works)
- CLAUDE.md skip-guard now triggers ONLY on literal marker, not on
  RULE 0.16 phrase

NOT addressed in this batch (deferred to a future round):
- github KeiSei84/{KeiSeiKit, KeiSeiKit-1.0} 404 (user-side action:
  publish repo or update refs)
- keigit user `keisei` does not exist (user-side: create org or
  rename scope)
- KEIGIT_TOKEN secret not configured (user-side action)
- Forgejo registration disabled (admin-side)
- safeEqual timing leak in TS server (LOW per W3 reassessment)
- HTTP bind 0.0.0.0 default (MEDIUM)
- Unbounded request body (MEDIUM)
- Outcome-only confirm-screen bypass (RULE 0.1 spirit)
- Ledger fallthrough false summary
- Node 20 deprecation (deadline 2026-06-02, 30 days)
- Hook count triple-discrepancy (38 README / 53 DNA-INDEX / 35 maturity-row)
- 100-row router claim still in README:117 + PROFILE-OUTCOME-ONLY.md
- INSTALL.md numerics without [REAL:] markers
- Stale .bak files accumulation policy (cosmetic)
- README per-claim [REAL: ] markers for 6 of 7 numerics

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-03 20:30:51 +08:00

12 KiB


Sleep Layer & Session Self-Audit

Day sessions → overnight consolidation → morning report. Three nightly phases on an Anthropic-cloud agent, plus an always-on session retrospective.

The nightly cycle at a glance

All three phases run in order, on the same scheduled trigger.

                          YOUR NIGHT
        ┌──────────────────────────────────────────────────────┐
Day →→→ │  Phase A            Phase B            Phase C       │ →→→ Morning
        │  INCUBATION         REM                NREM          │
        │  "sleep on it"      consolidation      deep-sleep    │
        │  v0.12.0            v0.11.0            v0.13.0       │
        │  (queued tasks)     (trace patterns)   (conflict     │
        │                                         refactor)    │
        └──────────────────────────────────────────────────────┘
              ↓                   ↓                  ↓
         sleep-results/     reports/sleep-*.md  sleep-deep/*.md
         <uuid>.md          (always)            (every N days)

Biological analog. Your Mac is the hippocampus (fast, stateful, volatile; it captures raw episodes). The memory-repo is the transport layer. The cloud agent is the neocortex (slow, stateless, generalising). The morning git pull is the recall. Phase A mirrors the "sleep on it" insight effect (Wagner et al. 2004, Nature 427:352–355 [VERIFIED: doi:10.1038/nature02223]). The original study did not isolate a specific sleep stage; secondary literature attributes the effect primarily to slow-wave sleep, so our mapping is loose. Phase B mirrors REM dream-state pattern extraction. Phase C mirrors NREM slow-wave system consolidation.

Phase interaction rules (important):

A marathon task in Phase A (8-hour budget, 1 task only) owns the whole night — Phases B and C are skipped for that night. Traces are append-only, so the next night's Phase B picks up the skipped backlog.
Phase C only fires when the number of days since your install date is a multiple of DEEP_SLEEP_CRON_DAYS (default 7). The anchor date lives in sync-repo/reports/install-anchor.txt.
The morning report is for HUMAN review. It is NEVER auto-injected into a Claude Code session. Any rule or hook that emerges from it is installed via /escalate-recurrence — not by the cloud agent.
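
The interaction rules above can be sketched as a small scheduling function. This is only an illustration of the stated rules, not the cloud agent's actual code: the function name, the phase labels, and the choice that the install day itself (day 0) does not trigger Phase C are all assumptions.

```python
from datetime import date


def phases_for_night(
    today: date,
    install_anchor: date,
    marathon_queued: bool,
    deep_sleep_cron_days: int = 7,
) -> list[str]:
    """Return the phases that would run tonight, in order (hypothetical helper)."""
    if marathon_queued:
        # A marathon task owns the whole night: Phases B and C are skipped.
        return ["A"]

    phases = ["A", "B"]
    days_since_install = (today - install_anchor).days
    # Assumption: day 0 (the install day) does not count as a Phase C night.
    if days_since_install > 0 and days_since_install % deep_sleep_cron_days == 0:
        phases.append("C")
    return phases
```

With an anchor of 2026-05-03 and the default cadence, 2026-05-10 would be the first night on which all three phases run, unless a marathon task is queued.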

Governed end-to-end by 5 in ~/.claude/rules/sleep-layer.md.

Session self-audit (4)

KeiSeiKit auto-analyzes sessions on 3 triggers:

Stop event — session ended; session-end-dump.sh archives the JSONL trace and ingests it into kei-memory.
Milestone commits — git commit -m "feat:" / "refactor:" / git merge; milestone-commit-hook.sh appends a one-line session summary to ~/.claude/memory/audit-backlog.md.
Error spike — 3+ errors in the last 20 tool calls; error-spike-detector.sh tags the pattern and logs it.
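
As a rough illustration of the error-spike trigger (3+ errors in the last 20 tool calls), a sliding-window counter is enough. The real detector is the shell hook error-spike-detector.sh; the class below is a hypothetical Python sketch of the same threshold logic, and its name and parameters are assumptions.

```python
from collections import deque


class ErrorSpikeDetector:
    """Sliding-window error counter (illustrative; not the shipped hook)."""

    def __init__(self, window: int = 20, threshold: int = 3):
        self.threshold = threshold
        # deque with maxlen drops the oldest call automatically.
        self.recent = deque(maxlen=window)

    def record(self, is_error: bool) -> bool:
        """Record one tool call; return True if the window now holds a spike."""
        self.recent.append(is_error)
        return sum(self.recent) >= self.threshold
```

Each tool call is recorded once; the spike flag clears on its own as soon as enough clean calls push the old errors out of the window.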

Findings surface via click-only AskUserQuestion, routing to /escalate-recurrence (codify rule + wiki + hook), /debug-deep (5-phase RCA), or the audit backlog (log-only). Silent-first: the first 10 sessions log only; prompts activate from session 11 onward, so the memory store has a useful baseline before it interrupts you. The session counter lives in ~/.claude/memory/audit-backlog.md.

Manual trigger: /self-audit skill (same flow, invoked on demand).

Requires the kei-memory primitive. Included in the dev and full profiles; otherwise add via ./install.sh --add=kei-memory.

Cloud REM sync (v0.11.0) — Phase B

Run a nightly "sleep" cycle on Anthropic's cloud — no laptop, no infra, no DevOps.

How it works:

Each session: your Mac pushes trace JSONL to a private git repo you control
03:00 local time: a remote Claude Code agent clones the repo, analyzes the last 24h of traces, writes reports/sleep-YYYY-MM-DD.md, and commits back
Next morning: git pull and read the consolidated findings

Current state (2026-05-03) — what Phase B does and does not do:

Phase B currently writes a markdown report at ~/Projects/KeiSeiKit-public/reports/sleep-YYYY-MM-DD.md (or the equivalent path inside your sync-repo). The report is intended to be read by a human.

Auto-codification of rules from sleep insights is not yet implemented. The ContractDoc designates /escalate-recurrence as the manual codification path — when you read the morning report and spot a pattern worth turning into a rule, you invoke that skill by hand.

When auto-codification lands, the loop will be:

Phase B detects pattern → opens AskUserQuestion →
  on user-confirm → writes rule + hook stub

This is tracked as a separate atomar; until then, Phase B is report-only and codification is human-in-the-loop. This matches the sleep-layer rule's "no feedback loop into agent state" invariant — nothing the cloud agent writes is auto-injected into a session.

Setup (one-time, ~5 min):

Create an empty private repo on GitHub / GitLab / Bitbucket / self-hosted Forgejo
In Claude Code run /sleep-setup
The wizard generates an SSH deploy key → you paste it into the repo's deploy-key settings with WRITE access
The wizard emits a ready-to-paste /schedule create command, with your local 03:00 converted to UTC
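
The local-to-UTC conversion in the last step is plain offset arithmetic. The sketch below shows one way to compute it, assuming a fixed UTC offset (so daylight-saving shifts are not handled); the function name is hypothetical and the exact syntax of the /schedule create command is deliberately not reproduced here.

```python
from datetime import datetime, time, timedelta, timezone


def local_to_utc_hm(local_time: time, utc_offset_hours: float) -> tuple[int, int]:
    """Convert a local wall-clock time to (hour, minute) in UTC.

    Assumes a fixed offset from UTC; DST transitions are ignored.
    """
    tz = timezone(timedelta(hours=utc_offset_hours))
    # The calendar date is arbitrary: with a fixed offset only the time matters.
    local_dt = datetime(2026, 1, 1, local_time.hour, local_time.minute, tzinfo=tz)
    utc_dt = local_dt.astimezone(timezone.utc)
    return utc_dt.hour, utc_dt.minute
```

For example, 03:00 at UTC+8 corresponds to 19:00 UTC on the previous calendar day, so the schedule would need to fire at hour 19, minute 0 in UTC.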

After that, the sleep cycle runs every night automatically. The morning report is yours to read — nothing is auto-injected back into any session.

Requires the kei-memory primitive (shipped in the dev and full profiles; add via ./install.sh --add=kei-memory otherwise). Sleep-sync scripts themselves are installed unconditionally and stay dormant until you opt in via /sleep-setup.

Opt in at install time with ./install.sh --with-sleep-sync (TTY-only). Governed by 5 in ~/.claude/rules/sleep-layer.md.

Sleep on it (incubation, v0.12.0) — Phase A

Defer a hard question or research task to the nightly remote agent: run /sleep-on-it, fill in one free-text field plus three clicks (type / priority / format), submit. The task lands in sync-repo/sleep-queue/ and the nightly agent processes it before REM consolidation.

Priority maps to a wall-clock budget. Pick the one that matches the task's difficulty:

Priority	Budget	When to pick
Quick	15 min, this night	Simple questions, fast lookups
Standard	60 min, this night	Default, medium research
Deep	4 hours, this night	Serious derivations, thorough prior-art
Marathon	Full night (up to 8 h), 1 task only	Hard equations, full autonomy; Phases B and C skipped that night
Weekly batch	60 min, next Sunday UTC	Non-urgent research

Checkpointing: Standard / Deep / Marathon runs commit a .partial.md every 20–30 minutes, so if the cloud session is cut short you still get the partial on morning pull.
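
The priority table and the checkpointing rule can be read as a small lookup. The sketch below encodes them literally; the type, field names, and lowercase keys are assumptions made for illustration and are not the kei-sleep-queue data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Budget:
    minutes: int          # wall-clock budget
    window: str           # when the task runs
    checkpoints: bool     # commits a .partial.md every 20-30 minutes
    exclusive: bool = False  # True if the task owns the whole night


# Values copied from the table above; checkpointing applies to
# Standard / Deep / Marathon only, as the text states.
BUDGETS = {
    "quick": Budget(15, "tonight", checkpoints=False),
    "standard": Budget(60, "tonight", checkpoints=True),
    "deep": Budget(240, "tonight", checkpoints=True),
    "marathon": Budget(480, "tonight", checkpoints=True, exclusive=True),
    "weekly": Budget(60, "next Sunday UTC", checkpoints=False),
}


def budget_for(priority: str) -> Budget:
    """Look up the budget for a priority label (case-insensitive)."""
    try:
        return BUDGETS[priority.lower()]
    except KeyError:
        raise ValueError(f"unknown priority: {priority!r}") from None
```

Keeping the table as data makes the marathon special case explicit: it is the only entry marked exclusive.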

Typical use:

"Should I use a continuous-time net for memory re-ranker?" → deep-research → architectural recommendation by morning
"Compare SvelteKit vs Astro vs Next.js App Router for the kit's landing" → comparative study
"Derive closed form for an attractor on a Stiefel manifold" → marathon mode, full night of autonomous derivation
"What patterns in audit-backlog have highest impact?" → pattern analysis

Results land in sync-repo/sleep-results/<uuid>.md, linked from the next morning's REM report. Biological analog: the "sleep on it" insight effect (Wagner et al. 2004, Nature), which the original study did not tie to a specific sleep stage. Queue mutations go through the kei-sleep-queue helper.

Deep-sleep NREM consolidation (v0.13.0) — Phase C

A third nightly phase — Phase C — runs after REM on a user-chosen cadence (default: every 7 days). Biological analog: NREM slow-wave-sleep system consolidation. The remote agent scans your memory-repo for conflicts across rules, hooks, _blocks/, and memory (contradictory directives, overlapping hook matchers, >70%-duplicate blocks, orphaned wikilinks, Constructor-Pattern violations) and produces a structured refactor plan.
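
To make the ">70%-duplicate blocks" check concrete, here is a minimal sketch using Python's standard difflib. The similarity metric kei-conflict-scan actually uses is not documented here, so the SequenceMatcher ratio, the function name, and the dict-of-strings input are all assumptions.

```python
from difflib import SequenceMatcher
from itertools import combinations


def find_duplicate_blocks(
    blocks: dict[str, str], threshold: float = 0.7
) -> list[tuple[str, str, float]]:
    """Return (name_a, name_b, ratio) for block pairs above the threshold.

    The ratio is difflib's character-level similarity in [0, 1]; this is a
    stand-in metric, not necessarily the one the real scanner uses.
    """
    hits = []
    for a, b in combinations(sorted(blocks), 2):
        ratio = SequenceMatcher(None, blocks[a], blocks[b]).ratio()
        if ratio > threshold:
            hits.append((a, b, ratio))
    return hits
```

The pairwise comparison is quadratic in the number of blocks, which is acceptable for a weekly batch job over a small _blocks/ directory but would need indexing for large corpora.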

4-primitive pipeline, in order:

kei-conflict-scan  →  kei-refactor-engine  →  kei-graph-check  (via kei-store transport)
 (detect)             (propose)                 (verify)         (read/write memory-repo)

kei-conflict-scan reads _rules/, hooks/hooks.json, _blocks/, and memory/ and emits a typed conflict list (name-collision, matcher-overlap, duplicate-block, orphan-wikilink, CP-violation).
kei-refactor-engine groups conflicts by safe-to-auto-resolve vs requires_human_decision and writes the plan + auto-resolve markdown.
kei-graph-check walks every wikilink / block-ref / handoff-ref in the proposed state; if anything fails to resolve, the fork branch is blocked and the plan is annotated.
kei-store is the transport — reads the pre-state from your GitHub / Forgejo / Gitea / FS / S3 backend and writes the two output files back atomically.
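
The verify step can be pictured as a link resolver over the proposed state. The sketch below handles only wikilinks and assumes the common `[[target]]`, `[[target|label]]`, and `[[target#section]]` syntax; block-refs and handoff-refs are omitted, and the function name is hypothetical rather than part of kei-graph-check.

```python
import re

# Capture the target of [[target]], [[target|label]] or [[target#section]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def unresolved_links(pages: dict[str, str]) -> dict[str, list[str]]:
    """Map each page name to the wikilink targets that do not exist."""
    known = set(pages)
    broken = {}
    for name, text in pages.items():
        targets = {t.strip() for t in WIKILINK.findall(text)}
        missing = sorted(targets - known)
        if missing:
            broken[name] = missing
    return broken
```

An empty result would correspond to the gate passing; any non-empty entry is the kind of failure that blocks the fork branch and gets annotated in the plan.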

Concrete example (real category, paraphrased):

Conflict detected: hook .sh (PreToolUse:Bash, matcher git push) and rule file patents.md (§"Never reference unfiled applications") both govern the same risk surface — a github push containing private language. The hook blocks on URL; the rule blocks on content. Suggested refactor: keep both (they are complementary), but add a cross-ref from patents.md to the hook so a future reader sees the two-layer defence. Auto-resolvable (pure documentation edit, no behaviour change). Written to YYYY-MM-DD-autoresolve.md for human review.

Two output modes, chosen once in /sleep-setup Phase 3b:

Plan only (default) — markdown report in sync-repo/sleep-deep/YYYY-MM-DD-plan.md. Read in the morning, decide what to merge by hand.
Plan + fork — same plan plus an auto-resolve review markdown (YYYY-MM-DD-autoresolve.md) listing the auto-resolvable conflicts with WHY / EXAMPLE / TRADEOFF per item. You open each file in an editor, apply the suggested change, commit on a deep-sleep/YYYY-MM-DD branch, then let the graph-check gate verify the wikilinks still resolve.

v0.14.1 retraction: earlier README claimed a git apply-ready patch. The engine cannot synthesise real unified-diff hunks without reading the source files — that would risk fabricated edits (RULE 0.4). The autoresolve file is now plain markdown reviewed and applied by hand; the "fork" path only automates the rename/move class of ops, not content edits.

Zero-conflict guarantee: any conflict the engine marks requires_human_decision is EXCLUDED from the auto-resolve markdown and listed plainly in the plan. No silent auto-apply of ambiguous changes.
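
The guarantee amounts to a strict partition of the conflict list. A minimal sketch, assuming a conflict record with a requires_human_decision flag (the dataclass and function names are illustrative, not the kei-refactor-engine API):

```python
from dataclasses import dataclass


@dataclass
class Conflict:
    kind: str      # e.g. "matcher-overlap", "duplicate-block", "orphan-wikilink"
    summary: str
    requires_human_decision: bool


def split_plan(conflicts: list[Conflict]) -> tuple[list[Conflict], list[Conflict]]:
    """Return (auto_resolvable, needs_human).

    Only the first list may appear in the auto-resolve markdown; the second
    is listed in the plan and never applied automatically.
    """
    auto = [c for c in conflicts if not c.requires_human_decision]
    manual = [c for c in conflicts if c.requires_human_decision]
    return auto, manual
```

Because every conflict lands in exactly one of the two lists, an ambiguous item can never leak into the auto-resolve file.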

Store backends (picked in Phase 3b, consumed via the new kei-store trait):

Backend	Status	Notes
GitHub private	production	SSH deploy key or PAT; default
Forgejo self-hosted	production	Same wire protocol as GitHub
Gitea self-hosted	production	Same wire protocol
Filesystem only	production	Local `.git`; no push; fastest
S3 / R2 / MinIO	production (v0.21, behind `s3` feature)	Real GetObject / PutObject / ListObjectsV2 via `aws-sdk-s3`. Build with `cargo build -p kei-store --features s3` and set `[s3] bucket = "..."` in `store-config.toml`. AWS default credential chain (env vars → `~/.aws/credentials` → IMDS). Custom endpoint for R2 / MinIO / Wasabi via `KEI_STORE_S3_ENDPOINT` env or `s3.endpoint` TOML field. Binary grows ~5 MB when the feature is on. Omit the feature OR omit `s3.bucket` to fall back to the v0.14 local-manifest stub (still gated by `KEI_STORE_ALLOW_S3_STUB=1`).

Requires the new kei-conflict-scan, kei-refactor-engine, kei-graph-check, and kei-store primitives (shipped in the dev and full profiles). Governed by the Phase C extension of 5 in ~/.claude/rules/sleep-layer.md.
