KeiSeiKit-1.0

Author	SHA1	Message	Date
Parfii-bot	aaa8f36e10	perf(ci): P1+P2 — thin-LTO + cu=16 + mold linker (~17min → ~4-5min) Critical-path math (cargo workspace 105 crates × 3 matrix targets): - Current profile: opt-level=z + lto=true + codegen-units=1 = compile cost ~10-20× over default; observed wall-time ~17min/release run - After P1+P2 stack: predicted ~4-5min cold, ~1.5min warm == P1 — _primitives/_rust/Cargo.toml profile.release == - lto: true → "thin" (full LTO is 3-5× slower; thin keeps most opts) - codegen-units: 1 → 16 (parallel codegen restored, was serial) - Binary size cost: ~10-15% larger (acceptable for non-embedded targets) - VERIFIED: cargo check --workspace exits clean [REAL: ran in this session; 0 errors, warnings only] == P2 — mold linker for Linux targets == - New: _primitives/_rust/.cargo/config.toml (7 LOC) * x86_64-unknown-linux-gnu + aarch64-unknown-linux-gnu use clang+mold * macOS targets unaffected (use system ld + LLVM) - New step in .github/workflows/release.yml::build-release: Install mold linker (Linux only) — apt-get mold clang Gate: `if: contains(matrix.target, 'linux')` - Inserted AFTER rust-toolchain BEFORE rust-cache - Predicted gain: link phase 60s → 6s on Linux entries == P3 — explicitly NOT applied == - Path-filter on docs-only commits considered + rejected per task spec: Release tags should always rebuild even if commit only touches docs. Files: - _primitives/_rust/Cargo.toml (+2/-2 LOC) - _primitives/_rust/.cargo/config.toml (NEW, 7 LOC) - .github/workflows/release.yml (+5/-0 LOC, mold install step) [ESTIMATE-HTC: rustc + mold benchmarks claim 3-5× and 5-10× respectively on full release builds — not re-benchmarked on this 105-crate workspace yet; will measure on next v* tag push] NOTE: this commit does NOT retag — keigit publish 401 issue is on the keigit-server side (verified: token works locally, 401 from runner IP) and requires user-side action (fail2ban/Caddy whitelist GitHub Actions IP ranges on 45.77.41.204). 
After user fixes that, next tag will verify both speed gain AND publish success. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 01:32:29 +08:00
Parfii-bot	3759fb0f64	fix(audit-batch): CI green + RULE 0.4/0.16/0.18 honesty pass 12-agent audit (2 waves Opus+Sonnet, 6 slices each) flagged 3 HIGH-tier issues that BOTH waves agreed on, plus 5 doc-honesty findings. This batch fixes the lot. == CI green (was failing on main `94a7d68`) == - _primitives/_rust/Cargo.toml — workspace tokio gains `io-std` feature (needed by kei-mcp/src/main.rs which calls tokio::io::{stdin,stdout}) - _primitives/_rust/kei-mcp/Cargo.toml — dev-deps tokio gains `test-util` feature (needed by tests/tools_call_timeout.rs for tokio::time::advance and Builder::start_paused). Both verified locally: `cargo check -p kei-mcp` ✓ `cargo test --no-run -p kei-mcp` ✓ (3 test binaries link) [REAL: ran 2026-05-03 in this session] == HIGH-tier audit fixes (consensus across waves) == 1. SQLi escape in agent-outcome-backfill.sh:110 - 4 of 12 agents flagged: TOOL_USE_ID was JSON-derived and interpolated raw into SQL. Allowlist on $SHIPPED protected today but a future case-statement removal opened the surface. - Fix: tiny `_sql_esc` helper that doubles single-quotes (SQL-99 standard escape), applied to SHIPPED + TOOL_USE_ID. STUBS already integer-validated. 2. PRAGMA user_version=9 in install/sql/outcome-only-schema.sql - W1 outcome-only critic flagged: the SQL fallback installed a v9-equivalent flat schema but left user_version=0. A LATER `kei-ledger init` (e.g. when user upgrades to full kit) would re-run migrations v1-v9 and ALTER TABLE ADD COLUMN duplicate-error mid-migration → broken DB. - Fix: set PRAGMA user_version=9 before COMMIT so the binary's migration runner sees current ≥ target and short-circuits. 3. backup_file mv→cp + uninstall macOS-portable awk - W1+W2 outcome-only flagged: lib-backup.sh uses `mv` which DELETES the target before _jq_merge_hooks runs; `|| true` swallowed the subsequent jq read-error → silent settings.json loss. 
- Fix in lib-profile-outcome-only.sh: `cp -p` aside, drop `|| true`, return 1 on merge failure (trap restores). - PROFILE-OUTCOME-ONLY.md uninstall used GNU sed `,+1` extension which BSD sed (macOS) does not support — uninstall silently no-op'd on macOS, leaving orphan CLAUDE.md text. - Fix: replace with portable `awk` recipe; also added `rm -f` for the agent-toolstats.jsonl sidecar (privacy completeness). == Doc honesty pass (RULE 0.18 numerics + RULE 0.4 citations) == 4. README.md count drift — verified all values against filesystem: * 102→105 Rust crates (Cargo.toml workspace `members` count) * 67→68 skills (`ls skills/ | wc -l`) * 35→38 hooks (`grep -c '"command":' settings-snippet.json`) * 37→38 agent manifests (`ls _manifests/*.toml | wc -l`) * 82→85 substrate blocks (`find _blocks/ -name '*.md' | wc -l`) * 18 capability atoms VERIFIED via `find _capabilities/ -name '*.md'` (encyclopedia §3 row count of 17 is in a separate file and is a known internal display issue, not changed in this commit) * 495→565 active DNAs (per docs/DNA-INDEX.md header 2026-05-03) Each value now carries a `[REAL: <command>]` style trailer per RULE 0.18. 5. README.md DNA "80-char identity" → "≥33-char variable-length" - W1+W2 reviewer-pass flagged FALSE: docs/DNA-FORMAT.md SSoT says minimum 33 chars; 80 was nowhere in code or spec - Fix in README.md:36 + docs/PHILOSOPHY.md:39 + docs/DNA-INDEX.md:1352 6. README.md "Eleven install profiles (... Cursor / Continue / Zed / Aider / Docker / Nix)" — Cursor/Continue/Zed/Aider/Docker/Nix were never install profiles, they were bridge targets - Fix: list 12 actual profiles from _primitives/MANIFEST.toml, mention bridges as separate concept 7. .claude-plugin/plugin.json license MIT → Apache-2.0 - W2-Sonnet reviewer flagged: LICENSE file is Apache-2.0 (since 2026-04-30 per NOTICE), but plugin.json still declared MIT — plugin marketplace would show wrong license 8. 
docs/ARCHITECTURE.md:318 placeholder URL `https://example.invalid/...` - W2-Sonnet reviewer flagged: dead link in published docs - Fix: remove the bad href, describe ssl-rule-file as per-user install outside the public repo 9. skills/sleep-on-it/SKILL.md Wagner et al. 2004 citation - W1+W2 reviewer flagged RULE 0.4 violation: citation without verification marker - Fix: added [VERIFIED: doi:10.1038/nature02223] + clarification that the original paper showed slow-wave-sleep (not strictly REM) insight gain — our metaphor is a loose mapping 10. encyclopedia/substrate-overview.md §5 fabricated TS deps - W1-Opus doc-consistency flagged RULE 0.4.b violation: 5 of 6 package rows had INVENTED dependency strings (`recall-ai-sdk ^1.0.0`, `nodemailer-mock ^2.0.0`, `telegram-typings ^4.10.0`, etc — none exist in the actual package.json files) - Fix: regenerated table from real `package.json` reads via `node -p "require(...).dependencies"` for each of the 6 packages - Fix: also corrected version drift (5 packages all 0.14.0 now) Verification: - Outcome-only end-to-end install against fake $HOME succeeds: hooks installed, ledger schema at user_version=9, settings.json created cleanly, all 5 documented files present [REAL: ran 2026-05-03 in this session] - `cargo check -p kei-mcp` + `cargo test --no-run -p kei-mcp` clean Audit findings NOT yet addressed (deferred to next batch): - README:65 git clone github URL — repo is private; reviewer flagged external strangers cannot clone; will resolve via Quick Start rewrite - npm.pkg.github.com / @keisei84 leftover sweep — both waves verified ZERO refs, no fix needed - safeEqual timing leak in TS server (W2 sec MEDIUM) - HTTP server bind 0.0.0.0 (W2 sec MEDIUM) - Unbounded request body (W2 ci MEDIUM) - --dry-run silent ignored on non-outcome profiles (W1+W2 MEDIUM) - Doc-link missing for MEMORY/DNA/LEDGER format specs from README Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 19:09:59 +08:00
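The `_sql_esc` helper described in this commit is a shell function. The same quote-doubling rule can be sketched in Rust as below; the function name `sql_esc` is a hypothetical counterpart, not code from the repository, and bound parameters remain the safer choice wherever the SQL driver offers them.

```rust
/// Doubles every single quote so the value can be placed inside a
/// single-quoted SQL string literal (the standard SQL escape).
/// Hypothetical Rust counterpart of the shell `_sql_esc` helper.
fn sql_esc(value: &str) -> String {
    value.replace('\'', "''")
}

/// Builds a quoted SQL literal from an untrusted value.
/// Only the quoting is shown; no statement from the real script is reproduced.
fn sql_literal(value: &str) -> String {
    format!("'{}'", sql_esc(value))
}
```

A value such as `x'; DROP TABLE t; --` stays inside one literal after escaping, because the embedded quote is doubled instead of terminating the string.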
Parfii-bot	4e99057d2b	fix(perf): bound per-user lock LRU + stream-cap atom subprocess output Two resource-exhaustion fixes from Opus Rust + Sonnet Rust audits. 1. kei-cortex per_user_locks DashMap unbounded growth (HIGH) File: kei-cortex/src/state.rs Bug: per_user_locks: DashMap<String, Arc<Mutex<()>>> inserted on every distinct user_id; never evicted. Auth'd attacker with 1M unique user_ids could OOM the daemon (~150 bytes/entry = 15GB at 100M entries). Fix: replaced DashMap with tokio::sync::Mutex<LruCache<String, Arc<TokioMutex<()>>>> capped at PER_USER_LOCK_CAP = 1024. Eviction is safe because callers hold their own Arc clone for their critical section; dropping the registry slot retires only the registry's reference. Used tokio::sync::Mutex for the registry because LruCache::get mutates the recency list and requires &mut self. Constructor Pattern: state.rs split into state.rs (184 LOC) + state_factories.rs (64 LOC, new). Tests added: user_lock_evicts_past_cap (registry stays ≤1024 after 2048 inserts), user_lock_keeps_most_recent (LRU recency preserved). Existing user_lock_is_stable_per_user + user_lock_differs_per_user updated to async — sole call site (handlers/portrait.rs) gains .await. 2. kei-runtime stdout/stderr cap was post-hoc (HIGH) File: kei-runtime/src/invoke.rs Bug: wait_with_output() buffered ALL child stdout/stderr; only cap_bytes truncated AFTER the child finished. A malicious atom writing 10 GB stdout (or a buggy one looping infinitely) OOM'd the runtime BEFORE the cap fired. Fix: replaced wait_with_output() with two reader threads sharing KillHandle = Arc<Mutex<Option<Child>>>. Each reader appends bytes up to STREAM_CAP = 16 MiB; on cap exceedance the reader KILLS the child from inside the reader thread (critical — otherwise the unbounded writer would never EOF and a post-hoc kill would never fire). Both readers drain the closing pipe to EOF and return. 
Truncation surfaces as InvokeError::SubprocessError with explicit "exceeded N byte cap" message. Constructor Pattern: invoke.rs decomposed into invoke.rs (159 LOC) + invoke_io.rs (146 LOC, new) + invoke_error.rs (54 LOC, new). Test added: invoke_kills_runaway_atom — stages a kei-flood script running cat /dev/zero, verifies (a) non-zero exit, (b) stdout < 18 MiB, (c) "cap"/"subprocess" in stderr. cargo check --workspace: clean. cargo test -p kei-cortex -p kei-runtime --test-threads=1: 471 pass / 0 fail. Pre-existing openai_loop_wiring.rs parallel-run flake (state collision when test-threads>1) is unrelated and unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 15:39:50 +08:00
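The capped per-user lock registry from item 1 can be sketched with the standard library alone. This is an assumption-laden illustration: the commit uses `tokio::sync::Mutex` around an `LruCache`, while the sketch below substitutes `std::sync::Mutex`, a `HashMap`, and a `VecDeque` recency list (O(n) per touch). Only the cap value 1024 comes from the commit message; the type and method names are hypothetical.

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::{Arc, Mutex};

/// Cap taken from the commit message.
const PER_USER_LOCK_CAP: usize = 1024;

#[derive(Default)]
struct Inner {
    locks: HashMap<String, Arc<Mutex<()>>>,
    recency: VecDeque<String>, // front = least recently used
}

/// Registry of per-user locks with least-recently-used eviction.
#[derive(Default)]
struct UserLocks {
    inner: Mutex<Inner>,
}

impl UserLocks {
    /// Returns the lock for `user_id`, creating it if needed, and evicts
    /// the least recently used entries once the cap is exceeded.
    fn lock_for(&self, user_id: &str) -> Arc<Mutex<()>> {
        let mut inner = self.inner.lock().unwrap();
        // Move (or add) this user to the most-recent end of the list.
        inner.recency.retain(|id| id != user_id);
        inner.recency.push_back(user_id.to_string());
        let lock = inner
            .locks
            .entry(user_id.to_string())
            .or_insert_with(|| Arc::new(Mutex::new(())))
            .clone();
        while inner.locks.len() > PER_USER_LOCK_CAP {
            match inner.recency.pop_front() {
                Some(oldest) => {
                    inner.locks.remove(&oldest);
                }
                None => break,
            }
        }
        lock
    }

    /// Number of locks currently held by the registry.
    fn len(&self) -> usize {
        self.inner.lock().unwrap().locks.len()
    }
}
```

Eviction only drops the registry's own `Arc`; a caller that still holds its clone keeps a valid lock for the rest of its critical section, which is the safety argument made in the commit.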
Parfii-bot	06ff2f8ed4	fix(auth): Google OIDC account-takeover (CVE-2023-7028 class) — email_verified gate + sub as user_id + id_token cross-check Opus Cross-cutting audit found a classic OIDC account-takeover hole in kei-auth-google::verify(). Same class as the public Booking.com / Slack / GitLab pattern. Root cause: verify() accepted info.email from userinfo response as user_id WITHOUT checking info.email_verified. A Google Workspace admin can mint accounts with arbitrary unverified email aliases. Attacker then OAuth-flows into the relying party using a victim's email as their alias and gets a session bound to that user_id. No email verification = no auth. Fix in 3 layers (defense in depth): 1. email_verified GATE - client.rs: UserInfo gains email_verified: bool with #[serde(default)] — absent field defaults to false (fail-closed). - error.rs: new Error::EmailNotVerified variant. - provider.rs::verify(): rejects with EmailNotVerified before any session is built when email_verified != true. 2. sub AS PRIMARY user_id - provider.rs::verify(): user_id = info.sub (Google's stable account id), NOT info.email. Email is now mutable metadata only. Email reassignment in Google Workspace cannot redirect an existing user_id binding. 3. id_token.sub CROSS-CHECK - id_token.rs (new, 104 LOC): JWT-claims-only extract_sub() — parses base64-payload without signature verification (signature verification against Google JWKS is a documented follow-up atomar). - provider.rs::verify(): when TokenResponse.id_token is present, decode claims and require id_token.sub == userinfo.sub. New Error::IdSubMismatch + IdTokenMalformed variants. - This adds defense against a forged userinfo response even though signature is not yet verified. Constructor Pattern compliance: provider.rs split into provider.rs (181 LOC) + verify_helpers.rs (114 LOC, with unpack_challenge / check_state / enforce_email_verified / cross_check_id_token_sub helpers). All files <200 LOC, all functions <30 LOC. 
Tests added: tests/google_security_regression.rs (164 LOC, 5 dedicated CVE-2023-7028 regression tests). All 26 tests pass: - verify_rejects_unverified_email - verify_rejects_missing_email_verified_field - verify_uses_sub_not_email_as_user_id - verify_rejects_id_token_sub_mismatch - verify_accepts_matching_id_token_sub cargo check --workspace clean. cargo test -p kei-auth-google: 26/26 pass. Follow-up: JWT signature verification against Google's JWKS endpoint with kid-based key cache + RS256/ES256 — separate atomar (~150 LOC). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 15:38:53 +08:00
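The three layers of this fix can be condensed into a small sketch. The struct layouts and the `verify` signature below are assumptions made for illustration (the real code deserializes `UserInfo` with serde and extracts the `sub` claim from the id_token itself); only the field names and error variant names come from the commit message.

```rust
/// Subset of a userinfo response. In the real crate a missing
/// `email_verified` field deserializes to `false` (fail-closed).
struct UserInfo {
    sub: String,
    email: String,
    email_verified: bool,
}

#[derive(Debug, PartialEq)]
enum Error {
    EmailNotVerified,
    IdSubMismatch,
}

#[derive(Debug)]
struct Session {
    user_id: String, // stable `sub`, never the email
    email: String,   // mutable metadata only
}

/// `id_token_sub` is the `sub` claim decoded from the id_token, if one
/// was returned by the token endpoint.
fn verify(info: UserInfo, id_token_sub: Option<&str>) -> Result<Session, Error> {
    // Layer 1: reject before any session is built.
    if !info.email_verified {
        return Err(Error::EmailNotVerified);
    }
    // Layer 3: both sources must agree on the account id.
    if let Some(token_sub) = id_token_sub {
        if token_sub != info.sub {
            return Err(Error::IdSubMismatch);
        }
    }
    // Layer 2: bind the session to `sub`, not to the email address.
    Ok(Session { user_id: info.sub, email: info.email })
}
```

Because the session key is `sub`, reassigning an email alias to another account cannot redirect an existing binding.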
Parfii-bot	71f17337fe	fix(security): cortex /term env_clear + bind guard, agent-stub-scan stdin, magiclink revoke Three independent security hardenings from cross-cutting audits. 1. cortex /term PTY env leak + bind guard (HIGH — Sonnet Cross-cutting + Opus) - kei-cortex/src/handlers/term_pty.rs: PTY spawn was inheriting daemon's full process env (KEI_AUTH_KEY, ANTHROPIC_API_KEY, FAL_KEY, etc.) into every authenticated /term shell. Combined with default cors_origin = https://keisei.app, one stored XSS on keisei.app + one bearer token = full local shell with all daemon secrets. Added apply_safe_env() helper: env_clear() + re-set only HOME, PATH, USER, LANG, TERM. Spawn helper invokes it before spawn_command. - kei-cortex/src/main.rs: extracted build_config() helper; added enforce_loopback_or_local_cors() guard called before serve.bind. Refuses to start if bind addr is non-loopback AND cors_origin is a public domain — prevents the XSS-to-shell scenario in production. 2. agent-stub-scan.sh stdin parsing (HIGH — multiple audits) - hooks/agent-stub-scan.sh: previously read $CLAUDE_AGENT_TRANSCRIPT env var which Claude Code does NOT set on PostToolUse:Agent. Hook silently exited 0 — RULE 0.16 enforcement was dead-code in production. Rewrote to read stdin JSON via jq, flatten .tool_response recursively (string|array|object via the same pattern as agent-event-done.sh), guard on .tool_name == "Agent" and command -v jq. Maintained WARN-tier exit-0 with TODO marker for ENFORCE flip on 2026-05-05 (per RULE 0.16 §2 ladder). 3. magiclink revoke() silent no-op (HIGH — Opus Rust + Sonnet Cross-cutting) - kei-auth-magiclink/src/{error,provider}.rs: revoke() previously returned Ok(()) without doing anything. Operators expecting "revoke a session" semantics from the AuthProvider trait got false success. Stolen magic-link URLs remained valid until the 15-minute TTL. Added Error::Unsupported variant. 
revoke() now returns Err(Unsupported(...)) with explicit guidance: "rotate KEI_MAGICLINK_HMAC_KEY to invalidate all live tokens, or maintain a deny-list at the caller layer". Test provider_revoke_returns_unsupported_error confirms the error variant is wired. Tests: cargo check + cargo test both PASS. 444 functional tests across kei-cortex (428 lib) + kei-auth-magiclink (16 lib + smoke). Pre-existing openai_loop_wiring.rs 502 failures in routes/openai/{chat,responses}.rs are NOT introduced by these fixes — separate unrelated triage. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 15:38:23 +08:00
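The environment scrubbing from item 1 can be sketched with `std::process::Command`. This is a stand-in: the commit applies `apply_safe_env()` to a PTY spawn helper whose builder type is not shown here, so the signature below is an assumption. The allowlist of five variables is taken from the commit message.

```rust
use std::process::Command;

/// Variables that are re-set after clearing (list from the commit message).
const SAFE_ENV: [&str; 5] = ["HOME", "PATH", "USER", "LANG", "TERM"];

/// Clears the child's environment, then copies only the allowlisted
/// variables from the parent process. Secrets held in the parent's
/// environment are therefore never inherited by the spawned shell.
fn apply_safe_env(cmd: &mut Command) {
    cmd.env_clear();
    for key in SAFE_ENV {
        if let Ok(value) = std::env::var(key) {
            cmd.env(key, value);
        }
    }
}
```

Calling `apply_safe_env(&mut cmd)` immediately before spawning keeps the allowlist in one place, so adding a variable later is a one-line change that is easy to review.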
Parfii-bot	a0b1eca6d9	chore: strip dangling sibling refs from Cargo.toml descriptions Opus TOML audit found 7 crates whose Cargo.toml description fields advertised sibling crates that don't exist in the workspace: - kei-auth-magiclink, kei-auth-webauthn → mentioned kei-auth-{github,microsoft} (workspace has only google + apple + magiclink + webauthn) - kei-notify-discord → mentioned kei-notify-email (workspace has telegram / discord / slack / sms only) - kei-net-wireguard, kei-net-ipsec → mentioned kei-net-tailscale (workspace has wireguard / openvpn / ipsec only) - kei-git-forgejo → mentioned kei-git-keigit (workspace has forgejo / gitea / gitlab / bitbucket) - kei-compute-linode → mentioned kei-compute-hetzner (Hetzner removed per rules/projects/project-vortex.md after TSPU blocks) - kei-provision/Cargo.toml description + metadata → both mentioned Hetzner Updated each description to mention only actually-existing siblings. cargo metadata consumers, IDE tooltips, and any future crates.io publication will no longer carry misleading sibling lists. cargo check --workspace clean (only pre-existing warnings unrelated to this change). Description-only metadata edits — zero functional impact. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 15:37:33 +08:00
Parfii-bot	7cc544fd85	chore: author email + Cargo metadata SSoT (parfionovich@keilab.io) Two related changes: 1. Author email update across the kit - All `info@greendragon.info` references replaced with `parfionovich@keilab.io` - Touched: NOTICE, README.md, _ts_packages/package.json (and 5 adapter packages), plus 90+ Cargo.toml files - Apache-2.0 attribution unchanged (Denis Parfionovich, 2026) 2. Cargo workspace.package SSoT for author/license/repository/homepage - Added to [workspace.package]: authors = ["Denis Parfionovich <parfionovich@keilab.io>"] license = "Apache-2.0" repository = "https://github.com/KeiSei84/KeiSeiKit-1.0" homepage = "https://github.com/KeiSei84/KeiSeiKit-1.0" - All ~89 member crates migrated from inline declarations to: authors.workspace = true license.workspace = true (repository/homepage where applicable) - Closes audit gap: kei-graph-stream, kei-cortex, kei-shared previously had no license field at the crate level, blocking `cargo publish` on those. Now they inherit Apache-2.0 from workspace. - kei-scheduler/Cargo.toml: removed stray duplicate `authors` line introduced by an earlier migration sweep. cargo check --workspace: clean. No code changes; metadata-only migration. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 13:55:28 +08:00
Parfii-bot	23b818a682	fix(auth): SecretString redacted Serialize + PKCE verifier wired Two findings from KeiSeiKit2.0 pr-review (~/Projects/KeiSeiKit2.0/skills/pr-review) applied to commit range 897d010..HEAD. 1. BLOCKER — SecretString silently leaked plaintext via Serialize. File: _primitives/_rust/kei-runtime-core/src/secrets.rs Was: derive(Serialize) + serde(transparent) -> serde_json::to_string(&secret) emitted the raw plaintext in any parent struct with #[derive(Serialize)]. Debug was redacted but Serialize was not. Defeated the type's purpose. Now: manual Serialize impl always emits literal "<redacted>". Deserialize derive kept (callers need to read secrets from config/env). Test serialize_emits_redacted_literal asserts JSON output is "\"<redacted>\"". 2. WARNING — PKCE code_verifier dropped before token exchange. build_auth_url generated code_challenge = SHA256(verifier) but verify() never threaded the verifier to the token endpoint. Token exchange submitted no code_verifier, defeating the PKCE protection. Files: - _primitives/_rust/kei-runtime-core/src/traits/auth.rs: AuthChallenge::OAuthCode now carries code_verifier: Option<String>. Caller stores verifier alongside state in their session-store, exactly as they already store state for CSRF check. - _primitives/_rust/kei-auth-google/src/provider.rs: verify() destructures code_verifier and passes to client.exchange_code(...). - _primitives/_rust/kei-auth-apple/src/provider.rs: same change. Tests added (wiremock body assertions): - google_smoke / apple_smoke: assert exchange request body contains code_verifier=<value> when challenge carried Some(verifier). - existing tests updated to construct OAuthCode { ..., code_verifier: None }. Test split (Constructor Pattern 200 LOC): - apple_smoke.rs grew over 200 LOC after PKCE test addition. Split into apple_smoke.rs (provider tests) + apple_client_smoke.rs (client tests). - same for google_smoke.rs / google_client_smoke.rs. 
Test results: 31 passed; 0 failed across kei-auth, kei-auth-apple, kei-auth-google, kei-runtime-core unit + integration tests. cargo check --workspace clean. Breaking change: any caller that constructs AuthChallenge::OAuthCode outside this workspace must add code_verifier field (None for legacy no-PKCE; Some for PKCE). Compile-time surfaced gap, not runtime regression. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 23:49:10 +08:00
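The PKCE wiring gap in item 2 comes down to whether the verifier reaches the token request. A minimal sketch, assuming a simplified `OAuthCode` struct with only the two fields needed here (the real variant also carries the state values) and a hypothetical `exchange_form` helper; `grant_type`, `code`, and `code_verifier` are the standard OAuth 2.0 / RFC 7636 parameter names, and client credentials and `redirect_uri` are omitted for brevity.

```rust
/// Simplified stand-in for `AuthChallenge::OAuthCode`.
struct OAuthCode {
    code: String,
    code_verifier: Option<String>,
}

/// Builds the form fields for the token-exchange request. When the
/// challenge carries a verifier it is included, so the authorization
/// server can check it against the code_challenge sent earlier.
fn exchange_form(challenge: &OAuthCode) -> Vec<(&'static str, String)> {
    let mut form = vec![
        ("grant_type", "authorization_code".to_string()),
        ("code", challenge.code.clone()),
    ];
    if let Some(verifier) = &challenge.code_verifier {
        form.push(("code_verifier", verifier.clone()));
    }
    form
}
```

With `code_verifier: None` the request is the legacy no-PKCE exchange, which matches the migration path described under "Breaking change".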
Parfii-bot	baf54250a9	fix(substrate): dangling handoffs + atomar manifest fill-out + validator extension Group F — manifest, capability, role, and assembler cleanup (post-audit 2026-05-02). Dangling handoff targets stripped: - validator.toml: removed handoffs to physics-deriver, patent-compliance - code-implementer.toml: removed physics-deriver handoff - architect.toml: removed physics-deriver - ml-implementer.toml: removed physics-deriver, fixed "multi-node multi-node" typo - ml-researcher.toml: removed physics-deriver, patent-researcher - researcher.toml: removed patent-researcher None of those manifest files exist in _manifests/. Comments added explaining the removal date for future re-authoring. Validator extension (_assembler): - src/validator.rs: extended validate() with check_handoff_targets — every [[handoff]].target must point to existing _manifests/<name>.toml. Future dangling handoffs blocked at validate time. - src/validator_tests.rs (new, 133 LOC): unit tests for handoff-target check. - tests/fixtures/_manifests/: added valid stubs for previously-missing manifests (architect, critic, security-auditor, validator, ml-implementer, ml-researcher, infra-implementer) so existing fixtures pass the new validator gate. - tests/snapshots/: insta snapshots updated for researcher + code-implementer. 
Atomar manifest fill-out (replaced stock copy-paste with domain-specific): - code-implementer-typescript: Drizzle/Zod/Next.js semantics - code-implementer-go: mesh networking, embedded servers - code-implementer-swift: SwiftUI, SPM, macOS menubar - code-implementer-python: RULE 0.2 exception language - code-implementer-flutter: Riverpod, Clean Architecture - infra-implementer-cicd/iac/container/secrets: tool-specific bans + scopes - researcher-web/code: output_extra_fields fixed (was code-implementer copy-paste "Largest file LOC", "Tests pass count" — now sources cited / evidence grade / gaps section) Capability schema completeness: - policy/no-git-ops + quality/cargo-check-green: added stage = "runtime" - 8 capabilities: added explicit parents = [] (was missing/inconsistent) Role schema: - _roles/auditor.toml + merger.toml: added [taxonomy] + [lineage] (was missing) - _roles/explorer.toml: added comment that "Explore" is the canonical Claude Code subagent type (case-sensitive) Reference path cleanup (manifest references): - critic.toml: ~/.claude/skills/architecture-rules/... -> path:user-skills/... - researcher.toml: stripped ~/.claude/agents/validator.md (machine-local) Misc: - frontend-validator.toml: renumbered duplicate step 6 -> step 7 kei-registry test fixture suppression: - tests/fixtures/{atom-sample,fake-kit,mini-kit}/.kei-registry-ignore (3 new files) - DNA-INDEX.md was inflating atom count by ~10% from test fixture rows; ignore-file hooks ready, kei-registry walker implementation is a follow-up. Tests: 59 passed; 0 failed; 1 ignored (pre-existing #[ignore]). cargo check clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 21:41:16 +08:00
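The handoff-target check added to the validator can be sketched as a pure filesystem lookup. The function name `dangling_handoffs` and its signature are hypothetical; the commit only states that every `[[handoff]].target` must resolve to an existing `_manifests/<name>.toml`.

```rust
use std::path::Path;

/// Returns the handoff targets that have no `<name>.toml` file in
/// `manifests_dir`. An empty result means every target resolves.
/// Target names are assumed to be plain file stems; a real validator
/// would also reject names containing path separators.
fn dangling_handoffs<'a>(targets: &[&'a str], manifests_dir: &Path) -> Vec<&'a str> {
    targets
        .iter()
        .copied()
        .filter(|name| !manifests_dir.join(format!("{name}.toml")).is_file())
        .collect()
}
```

Returning the full list of missing targets, instead of failing on the first one, lets a single validation run report every dangling handoff at once.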
Parfii-bot	da4d88910a	chore(workspace): SSoT inheritance + version unification Group E — Cargo workspace hygiene (post-audit 2026-05-02). Workspace dependency inheritance: - 40+ member crates migrated from inline dep pinning to { workspace = true }. Was: every crate redeclared clap/serde/rusqlite/tokio/etc inline, defeating the [workspace.dependencies] SSoT and forcing N edits per upgrade. Authoritative pins now live solely in _primitives/_rust/Cargo.toml. Major version splits resolved: - dashmap: 5 vs 6 (kei-cortex/kei-gateway) -> 6 in workspace - tower: 0.4 vs 0.5 (kei-cortex/kei-forge) -> 0.5 in workspace - notify: 6 vs 8 (kei-projects-watcher/kei-watch+kei-skills) -> 8 in workspace - thiserror: 1 vs 2 (workspace/keisei) -> kept 1; keisei downgraded Closed: dual-major compilation = wasted build time + ABI mismatch risk at trait boundaries. Profile / orphan cleanup: - kei-changelog/Cargo.toml: deleted [profile.release] block (workspace member profiles are silently ignored by Cargo since 1.0). - kei-brain-view/Cargo.toml: removed dangling "[workspace] table stripped on merge" comment (orphan from prior decomposition). rust-version SSoT: - 27+ member crates migrated from inline rust-version = "1.75" to rust-version.workspace = true. Workspace declares 1.77; the inline 1.75 pins were stale and misleading (with resolver 2 the workspace MSRV won anyway). cargo check --workspace: clean (only pre-existing sqlx-postgres future-incompat warning + frustration-matrix dead-code warning, neither introduced by this change). Note: _assembler/ lives outside _primitives/_rust workspace, so its Cargo.toml was not touched here. Remaining edition-2024 question for _assembler is a separate decision. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 21:40:46 +08:00
Parfii-bot	cb1090bef3	fix(security): RCE allowlist + WebSocket auth + SSH option-injection Group D — three independent security primitives hardening (post-audit 2026-05-02). kei-runtime — atom invoke RCE allowlist: - invoke.rs: is_safe_crate_name validator (regex ^kei-[a-z][a-z0-9-]+$); rejects /, \\, .., :, absolute paths, empty, >128 chars. InvalidAtom error variant. stdout/stderr capped at 16 MiB (was unbounded). - main.rs: InvalidAtom mapped to exit code 2. - tests/invoke_exit_codes_smoke.rs: invoke_unsafe_crate_name_exits_2 added. - Closes: any user able to write atoms/*.md with crate_name: "rm" or "sudo" triggered arbitrary command execution. kei-graph-stream — WebSocket bearer + Origin: - auth.rs (new, 142 LOC): token load + bearer extraction + Origin allowlist + ConstantTimeEq compare; 8 unit tests. - ws.rs: ws_handler validates Origin + bearer before upgrade (403/401 on failure). - main.rs: --public-bind-i-accept-the-leak flag required for non-loopback bind; else bail!() with explicit error. - tests/smoke.rs: rewritten with Origin + bearer headers via connect_async_with_config. - Closes: WebSocket /stream had zero auth, zero Origin check; browser CSWSH could subscribe to agent activity broadcast; KEI_GRAPH_STREAM_BIND env silently accepted any SocketAddr. kei-compute-baremetal — SSH option injection (CVE-2023-51385 class): - ssh.rs: is_safe_user + is_safe_host validators (alphanumeric + -_.; reject leading -; max 64 chars; no @, :, /, \\, space). - ssh.rs: -- sentinel before user@host argv (OpenSSH 9.6+ stops flag parsing). - ssh.rs: StrictHostKeyChecking=yes default; KEI_BAREMETAL_ACCEPT_NEW=1 for TOFU. - error.rs: InvalidRegion variant. - provider.rs: validators applied in target_for_spec + target_for_handle. - Closes: spec.region "-oProxyCommand=evil" triggered local RCE before TCP connect. Test results: 29 passed; 0 failed across all three crates. cargo check clean. 
Findings: RCE allowlist (Wave-A) + WebSocket auth (Wave-B) + SSH injection (Wave-B) were unique-per-retest discoveries. None present in original wave-1 audit. Note: kei-compute-baremetal/src/provider.rs at 300 LOC (was 268; +32 from validators). Pre-existing >200 LOC violation, fix scope was security-additions only. Follow-up: split provider.rs into provider.rs (<200) + provider_tests.rs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 21:40:24 +08:00
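The two allowlist validators named in this commit can be sketched without a regex dependency. The function names come from the commit message; the bodies below are a hand-written reading of the stated rules (`^kei-[a-z][a-z0-9-]+$` with a 128-character limit, and alphanumerics plus `-_.` with no leading `-` and a 64-character limit), not the repository's code.

```rust
/// Hand-written equivalent of ^kei-[a-z][a-z0-9-]+$ plus the
/// 128-character limit.
fn is_safe_crate_name(name: &str) -> bool {
    if name.len() > 128 {
        return false;
    }
    let Some(rest) = name.strip_prefix("kei-") else {
        return false;
    };
    let mut chars = rest.chars();
    // First character after the prefix must be a lowercase letter.
    let first_ok = matches!(chars.next(), Some('a'..='z'));
    // The `+` in the regex requires at least one more character.
    let tail = chars.as_str();
    first_ok
        && !tail.is_empty()
        && tail
            .chars()
            .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '-')
}

/// Host validator: ASCII alphanumerics plus `-`, `_`, `.`; no leading
/// `-` (which ssh would parse as an option); at most 64 characters.
fn is_safe_host(host: &str) -> bool {
    !host.is_empty()
        && host.len() <= 64
        && !host.starts_with('-')
        && host
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || matches!(c, '-' | '_' | '.'))
}
```

Both checks are allowlists: anything not explicitly permitted (slashes, `..`, `:`, `@`, spaces) is rejected without needing a separate deny rule.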
Parfii-bot	9aa29aca15	fix(kei-cortex): SSRF + atomic token + body limits + capped reads Group C — kei-cortex daemon security hardening (post-audit 2026-05-02). - fal_ssrf.rs (new): validate_fal_url whitelist (fal.ai/.media/.run only). Applied to upload_url, file_url, status_url, images[0].url, and download_image. Closes SSRF where compromised fal response could direct daemon to fetch IMDSv1 (169.254.169.254) and stream cloud creds. - fal_pipeline.rs (new): HTTP step functions extracted from fal.rs; fal.rs trimmed to thin orchestrator (101 LOC, was over 200 LOC limit). - auth.rs: save_token now writes to <path>.<nanos>.tmp + sync_all + rename. Was non-atomic OpenOptions truncate+write — crash mid-write produced empty token file -> bootstrap rotated -> stale clients locked out. - routes.rs + routes_auth.rs (new): explicit DefaultBodyLimit per route — chat 256 KiB, tool/apply 11 MiB, pet/interaction 64 KiB, tts 32 KiB. Bearer auth middleware extracted to routes_auth. - handlers/chat.rs: validate_body enforces MAX_MESSAGE_CHARS = 50_000. Closed cost amplification where 1.99 MiB chat message billed 500K tokens ($1.50/turn at Sonnet pricing) on every send. - anthropic_sse.rs: SseParser MAX_BUF = 1 MiB cap; was unbounded — peer streaming 1 GB without \\n\\n would OOM daemon. - http_helpers.rs (new): HTTP_CLIENT: Lazy<reqwest::Client> shared across handlers (was per-request Client::new() => 100-300ms TLS handshake per chat turn, no HTTP/2 multiplexing, fd leak risk on macOS TIME_WAIT). - http_helpers.rs::read_capped: per-response body cap (16 KiB error / 64 MiB success). Applied to anthropic, anthropic_invoker, elevenlabs, fal_pipeline. Closed unbounded resp.text() / .bytes() pattern that compromised upstream could exploit. Test results: 462 passed; 0 failed (single-threaded). cargo check clean. 2 pre-existing port-binding flakes in openai_loop_wiring tests are unrelated. 
Findings consensus: fal SSRF + body-size + bearer-token-atomicity appeared in Wave-A retest; chat message cap + SSE buf cap appeared in Wave-A only. Would have been missed by single audit pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 21:39:57 +08:00
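The atomic token write described for auth.rs (temporary file, `sync_all`, then rename) can be sketched with the standard library. The function name `save_token_atomic` and its signature are assumptions; file permissions and syncing the parent directory are left out to keep the sketch short.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};
use std::time::{SystemTime, UNIX_EPOCH};

/// Writes `token` to `<path>.<nanos>.tmp`, flushes it to disk, then
/// renames it over `path`. A crash before the rename leaves the old
/// token file untouched instead of a truncated one.
fn save_token_atomic(path: &Path, token: &str) -> io::Result<()> {
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_nanos())
        .unwrap_or(0);
    // Keep the temp file in the same directory so the rename stays
    // on one filesystem.
    let mut tmp_name = path.as_os_str().to_owned();
    tmp_name.push(format!(".{nanos}.tmp"));
    let tmp = PathBuf::from(tmp_name);

    let mut file = File::create(&tmp)?;
    file.write_all(token.as_bytes())?;
    file.sync_all()?;
    fs::rename(&tmp, path)
}
```

Readers of `path` see either the complete old token or the complete new one, which avoids the empty-file state that triggered the bootstrap rotation described above.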
Parfii-bot	8b0401b9db	feat(auth): JWT verification + OAuth CSRF + PKCE + secret redaction Group B — auth-crate security hardening (post-audit Sonnet test-retest 2026-05-02). kei-auth-apple: - jwt.rs: full ES256 JWKS signature verification (jsonwebtoken crate); validates iss == https://appleid.apple.com, aud == client_id, exp, iat; decode_id_token_unverified is now cfg(test)-only. Module docstring promised this since v0.1 — now actually implemented. - claims.rs (new): IdTokenClaims + AudClaim extracted from jwt.rs. - error.rs: JwtVerify, JwtDecode, MissingClaim variants. - client.rs: client_secret_jwt: SecretString (was String); exchange_code accepts code_verifier: Option<&str> for PKCE. - provider.rs: verify() does CSRF expected_state ConstantTimeEq + JWT verification; build_auth_url accepts state + verifier and emits PKCE code_challenge. - tests/apple_smoke.rs + helpers/: 6 tests including malformed-JWT + non-Apple OAuth + 400-mapping + provider_verify_csrf_mismatch_rejected. kei-auth-google: - pkce.rs (new): pkce_challenge + url_encode (RFC 7636 §B.1 test vector covered). - client.rs: client_secret: SecretString; exchange_code accepts code_verifier. - provider.rs: verify() rejects on state mismatch; build_auth_url emits S256 challenge. - tests/google_smoke.rs: 7 tests including CSRF mismatch. kei-auth: - main.rs: resolve_token() supports stdin (-) and KEI_AUTH_TOKEN env. Token positional arg leaked via /proc/<pid>/cmdline + shell history; same fix that v0.14.1 applied to --key. - main.rs::key(): hard fail if KEI_AUTH_KEY len < 32 bytes (mirror of magiclink). - tokens.rs::verify(): query_row(...).optional()? instead of .ok() — DB errors now propagate instead of being swallowed as "token unknown". kei-runtime-core: - secrets.rs (new, 81 LOC): SecretString newtype with redacted Debug + zeroize-on-Drop. Required by every auth crate that holds secret material. - traits/auth.rs: AuthChallenge::Password.password is now SecretString; OAuthCode { state, expected_state }. 
- error.rs: CsrfStateMismatch variant. Test results: 48 passed; 0 failed across kei-auth, kei-auth-apple, kei-auth-google, kei-auth-magiclink, kei-runtime-core. cargo check --workspace clean. Findings consensus: Apple JWT unverified + OAuth state CSRF appeared in all 3 audit waves (Wave-1 + Wave-A + Wave-B); PKCE absence + secret-derive-Debug appeared only in Wave-A retest, would have been missed by single-pass audit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 21:39:18 +08:00
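The two std-only ideas in this commit that are easiest to illustrate are the redacted-Debug secret newtype and the constant-time CSRF state comparison. The following is a minimal sketch, not the crate's code: the method names `new` and `expose` and the helper `states_match` are hypothetical, and the wipe in `Drop` is only best-effort (a real implementation would need a volatile write, for example via a crate such as `zeroize`).

```rust
use std::fmt;

/// Hypothetical stand-in for the `SecretString` newtype described in the commit.
pub struct SecretString(String);

impl SecretString {
    pub fn new(value: impl Into<String>) -> Self {
        SecretString(value.into())
    }

    /// Explicit accessor so every read of the secret is visible in review.
    pub fn expose(&self) -> &str {
        &self.0
    }
}

impl fmt::Debug for SecretString {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Never print the inner value, even in `{:?}` logs.
        f.write_str("SecretString([REDACTED])")
    }
}

impl Drop for SecretString {
    fn drop(&mut self) {
        // Best-effort wipe only: the optimizer is allowed to elide these writes.
        let mut bytes = std::mem::take(&mut self.0).into_bytes();
        for b in bytes.iter_mut() {
            *b = 0;
        }
    }
}

/// Compares the OAuth `state` values without an early exit on the first
/// differing byte (assumption: length itself is not secret).
pub fn states_match(expected: &str, got: &str) -> bool {
    let (a, b) = (expected.as_bytes(), got.as_bytes());
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b) {
        diff |= x ^ y;
    }
    diff == 0
}
```

Keeping `Debug` hand-written (instead of derived) is what prevents the secret from leaking through `{:?}` formatting of any struct that embeds it.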
Parfii-bot	52a02dfbff	feat(live-graph): WebSocket activity stream — orchestrator-centric live view User pushback: "does it stream live which agents are being created? The main agent window, and then new branches appear as things launch; we show live how agents are assembled and work" Earlier `kei-graph-export` rendered the static SUBSTRATE (all 581 atoms, catalog-style). User wanted the LIFECYCLE: orchestrator at center, every new agent as a fading-in branch, every tool call as a pulse, every completion as a fade-out. TTL = until done; pure online, no history accumulation per user direction. Three-layer architecture, all conforming to schema /tmp/agent-events-schema.md: LAYER 1 — Event emitters (4 hooks) hooks/agent-event-spawn.sh PreToolUse:Agent → agent_spawn event hooks/agent-event-done.sh PostToolUse:Agent → agent_done event (parses STATUS-TRUTH MARKER for outcome, computes cost_usd from token×pricing table) hooks/tool-use-event.sh PreToolUse:Bash\|Read\|Edit\|Write\|Grep\|Glob\|NotebookEdit → tool_use event hooks/skill-record.sh EXTENDED — second emit step writes skill_use event in addition to existing kei-ledger record-skill call All 4 are POSIX /bin/sh, defensive (never block, exit 0), bypass via KEI_EVENTS_BYPASS=1. Append-only JSONL to ~/.claude/memory/agent-events.jsonl. Smoke: 4 synthetic invocations cover spawn/done/tool/filter cases. LAYER 2 — kei-graph-stream Rust daemon _primitives/_rust/kei-graph-stream/ (~480 LOC, 5 files + 1 test) - Tails events.jsonl every 200ms (poll-based, no notify dep). - Parses each event, updates AliveState (insert on spawn, remove on done). - Broadcasts {"type":"event","data":<event>} to all WebSocket clients. - On client connect: sends {"type":"snapshot","alive":[...]} first. - Heartbeat: {"type":"ping"} every 30s. - axum 0.7 + ws feature (already in Cargo.lock via kei-cortex). - Bypass: KEI_GRAPH_STREAM_BYPASS=1. Bound to 127.0.0.1:8201 (loopback only). 
Endpoints: GET /stream → WebSocket upgrade GET /health → "kei-graph-stream alive" 4 unit + 1 integration test. cargo build clean. Installed binary: ~/.cargo/bin/kei-graph-stream Launchd plist: io.keisei.graph-stream (RunAtLoad, KeepAlive) Loaded as PID 52678, /health 200 OK verified. LAYER 3 — live-graph.html (single-file frontend) ~/Projects/lbm-graph-viz/live-graph.html (~464 LOC, self-contained) - SVG full-viewport, dark #0f172a, CSS grid background. - Pinned center node "main" (orchestrator), gold #fbbf24, glowing. - Agents radiate via D3 force-simulation; color-by-model (sonnet=green, opus=red, haiku=blue, default=gray). - On agent_spawn: fade-in 300ms, edge from main to new node. - On tool_use: pulse on agent node (r 8→12→8 over 400ms) + floating tool name label fades 800ms. - On agent_done: outcome-color flash → fade-out 800ms → remove. - WebSocket client: ws://127.0.0.1:8201/stream, exponential-backoff reconnect (1s→30s). - Top-right status badge: ● connected \| ○ reconnecting \| ✕ disconnected. - Bottom counters: alive / spawned / tool calls / done / last event age. - No build step. D3 v7 from CDN. Pure HTML+JS+CSS. End-to-end smoke (this machine, just now): - daemon health 200 OK - hook injected agent_spawn → daemon broadcasts → AliveState=1 - hook injected agent_done → daemon broadcasts → AliveState=0 - frontend file syntax-checked clean What this does NOT do (deferred, by user direction "this is live"): - History persistence — agents who finished are GONE from the graph. Per-session log remains in events.jsonl + sleep-sync if user wants to consult later, but the live view is RIGHT NOW only. - Sub-agent attribution beyond "main" — orchestrator-direct tool calls show on the orchestrator node. Sub-agent's internal tool calls would need session-id correlation; current schema has agent_id="main" placeholder for non-Agent tool calls. - Replay mode — no time-scrubber. Possible follow-up if useful. - Auth on WebSocket — bound to 127.0.0.1 only. 
Local-only by design. === STATUS-TRUTH MARKER === shipped: functional stubs: 0 cargo-check: PASS behaviour-verified: yes follow-up-required: - Sub-agent tool-call attribution (correlate session_id chain) - Replay mode with time scrubber (if user finds use) - Tool aggregator nodes ("Bash bucket" with N) instead of per-agent pulses Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 13:30:24 +08:00
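The "insert on spawn, remove on done" rule for the daemon's AliveState can be sketched without any of the WebSocket or JSON plumbing. This is an illustrative reduction under stated assumptions: the `Event` enum, its field names, and the `apply` / `snapshot` methods are hypothetical and are not taken from the kei-graph-stream source.

```rust
use std::collections::BTreeMap;

/// Hypothetical, already-parsed form of one JSONL event line.
#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    AgentSpawn { agent_id: String, model: String },
    AgentDone { agent_id: String },
    ToolUse { agent_id: String, tool: String },
}

/// Tracks which agents are alive right now; finished agents leave no trace.
#[derive(Debug, Default)]
pub struct AliveState {
    alive: BTreeMap<String, String>, // agent_id -> model
}

impl AliveState {
    /// Insert on spawn, remove on done; tool calls do not change liveness.
    pub fn apply(&mut self, event: &Event) {
        match event {
            Event::AgentSpawn { agent_id, model } => {
                self.alive.insert(agent_id.clone(), model.clone());
            }
            Event::AgentDone { agent_id } => {
                self.alive.remove(agent_id);
            }
            Event::ToolUse { .. } => {}
        }
    }

    /// Agent ids to send to a newly connected client, in sorted order.
    pub fn snapshot(&self) -> Vec<String> {
        self.alive.keys().cloned().collect()
    }
}
```

Sending a snapshot built this way on connect is what lets a late-joining browser tab draw the currently running agents without replaying the whole event file.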
Parfii-bot	a31a056f61	feat(graph): live runtime DNA viewer — kei-graph-export + lbm-graph-viz adapter User pushback: "can we connect our KeiSei to Obsidian? Will it build a graph of all our agents live?" Closer-to-question architecture: don't build new Obsidian plugin — re-use the legacy `~/Projects/lbm-graph-viz/` D3 viewer (lineage: keicode → living-graph → lbm → lbm-graph-viz → keisei-graph). Strip its Hebbian/co-change edges, replace with DNA-derived edges from the kei-registry + kei-ledger. Open in any browser, file://...index.html. NEW Rust crate `_primitives/_rust/kei-graph-export/` (~440 LOC, 5 files) Reads: ~/.claude/registry.sqlite (730 active blocks) ~/.claude/agents/ledger.sqlite (6 agents post-cleanup) _manifests/*.toml (38 agent manifests) Emits 581-node, 291-edge graph. Edge types: block_dep 171 manifest → atom (blocks=[]) path_ref 99 manifest → atom (path:NAME refs) branch_lineage 11 parent_branch → branch agent_uses_manifest 10 agent → manifest (slug from branch name) Output formats: --format spaces-fragment → `window.RUNTIME_SPACE = {...}` JS file --format json → raw {nodes, links} for downstream tools Block-name lookup is multi-resolution: each block is registered under display name + lowercased + file-stem slug (from path basename) so manifest references like `blocks = ["baseline"]` resolve to a registry row whose `name` column holds "BASELINE — inherit from Main Claude". Without this fix the graph had 0 block_dep edges; with it, 171. NEW background updater `hooks/graph-export-watcher.sh` + launchd plist template `_primitives/templates/io.keisei.graph-export.plist` 5-second loop: while true; do kei-graph-export --format spaces-fragment --output <viz>/data-runtime.js.tmp mv <viz>/data-runtime.js.tmp <viz>/data-runtime.js # atomic sleep 5 done launchd plist substitutes `HOME_DIR` and `HOOKS_DIR` placeholders at install time. RunAtLoad=true, KeepAlive=true. Logs to ~/.claude/memory/graph-export.log. Bypass: GRAPH_EXPORT_BYPASS=1. 
Loaded into user-side launchd (PID 16474 confirmed running). File mtime advances every 5s — live updates verified. PATCH `~/Projects/lbm-graph-viz/index.html` (outside kit, surgical) Three changes: 1. Add `<script src="data-runtime.js">` BEFORE `spaces.js` (window global available when SPACES is defined). 2. After spaces.js: `if (window.RUNTIME_SPACE) SPACES.runtime = window.RUNTIME_SPACE;` 3. Auto-refresh setInterval(5s): fetch data-runtime.js, eval (re-assigns window.RUNTIME_SPACE), hash-compare, re-render via `rebuildGraph()` if currently viewing the runtime space. window.RUNTIME_SPACE (not const RUNTIME_SPACE) avoids the "const cannot be re-declared" error on subsequent eval() calls. Effect: open file://~/Projects/lbm-graph-viz/index.html in any browser, switch to "Runtime" space — full DNA graph of every agent / atom / skill / branch / manifest / hook / primitive / rule, force-laid-out by D3. Updates every 5 seconds without page reload. What this does NOT do (deferred): - Obsidian mirror — separate work, would emit .md per node into ~/Projects/KeiSeiVault/. Useful for backlinks navigation but file-watcher latency similar to current 5s polling. - Skill-invocation edges — table is empty until next Skill tool use; will populate naturally. - Scoped queries (orphan finder, hot-path PageRank). Out of scope for v1; the JSON --format export feeds any downstream tool. - `agent_uses_manifest` heuristic warns on unknown subagent slugs (e.g. `physics-deriver` with no manifest yet). Non-fatal. === STATUS-TRUTH MARKER === shipped: functional stubs: 0 cargo-check: PASS behaviour-verified: yes follow-up-required: - Obsidian vault mirror (Phase C, separate work) - Skill-edges populate from real Skill use (not blocking) - Hot-path PageRank highlighting in viewer (cosmetic) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 13:07:21 +08:00
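The multi-resolution block-name lookup described in this commit (display name, lowercased name, file-stem slug) could look roughly like the sketch below. The `BlockIndex` type, its methods, and the use of a plain integer id are assumptions made for illustration; the real crate reads rows from the registry database.

```rust
use std::collections::HashMap;
use std::path::Path;

/// Hypothetical index mapping every accepted spelling of a block to its row id.
#[derive(Default)]
pub struct BlockIndex {
    by_key: HashMap<String, usize>,
}

impl BlockIndex {
    /// Registers one block under three keys: display name, lowercased
    /// display name, and the lowercased file stem of its path.
    /// The first registration of a key wins.
    pub fn register(&mut self, id: usize, display_name: &str, path: &str) {
        self.by_key.entry(display_name.to_string()).or_insert(id);
        self.by_key.entry(display_name.to_lowercase()).or_insert(id);
        if let Some(stem) = Path::new(path).file_stem().and_then(|s| s.to_str()) {
            self.by_key.entry(stem.to_lowercase()).or_insert(id);
        }
    }

    /// Resolves a manifest reference, trying the exact spelling first
    /// and then its lowercased form.
    pub fn resolve(&self, reference: &str) -> Option<usize> {
        self.by_key
            .get(reference)
            .or_else(|| self.by_key.get(&reference.to_lowercase()))
            .copied()
    }
}
```

With an index like this, a manifest entry such as `blocks = ["baseline"]` resolves through the file-stem key even when the registry's `name` column holds a long human-readable title.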
Parfii-bot	9a3db14b90	feat(sleep-sync): mirror time-metrics + ledger snapshots, surface in Phase B report User pushback: "what does sleep do now? Is everything connected?" — Sleep Phase B was reading only `traces/`, ignoring the four tracking journals shipped in the previous commit. Cloud agent had a partial view of what happened. This commit closes the loop. Sleep now sees everything that's tracked. PUSH SIDE — `kei-sleep-sync.sh` (called on every Stop event) Now mirrors the full observability surface into the memory-repo: ~/.claude/memory/time-metrics/sessions.jsonl → time-metrics/ ~/.claude/memory/time-metrics/tasks.jsonl → time-metrics/ ~/.claude/memory/time-metrics/numeric-claims.jsonl → time-metrics/ ~/.claude/memory/time-metrics/agent-toolstats.jsonl → time-metrics/ ~/.claude/agents/ledger.sqlite agents table → ledger/agents.jsonl ~/.claude/agents/ledger.sqlite skill_invocations → ledger/skill_invocations.jsonl Format: JSONL (one row per object). The two ledger tables are dumped via `sqlite3 + json_object()` so cloud agents can stream-parse into pandas / duckdb without binary-file handling. First sync moved 6 files / 638 rows from local to remote — verified by `git show --stat` of the resulting `memory: session traces` commit. CONSUME SIDE — `phase-b-rem.sh` REM-consolidation report Each nightly `reports/sleep-YYYY-MM-DD.md` now ends with a "Tracking observability (last 7 days)" section containing four jq-aggregated digests: 1. Agent outcomes — per-model: n, functional/partial/scaffolding/fail counts + total_cost_usd. Lets the agent see whether the model-tier refactor (`cb1fdde`) actually paid off and whether Sonnet success rate justifies routing more task classes to it. 2. Skill success rates — per-skill: n, successes, rate_pct. Drives Phase D nightly decisions (archive unused / re-extract failing / mark validated). Empty until Skill tool is invoked in the next session. 3. Numeric-claims tier breakdown — REAL / FROM-JOURNAL / ESTIMATE-HTC counts. 
High ESTIMATE-HTC ratio = orchestrator under-calibrated. Cloud agent's job: spot frequent ESTIMATE-HTC categories and propose conversion to FROM-JOURNAL via measured runs. 4. Agent tool-call patterns — mean tool_use_count, mean duration_ms, per-tool total calls. Lets the agent see "this code-implementer spawn made 30 Read but 1 Edit — was tier-allocation correct?". All four sections gracefully skip if the source JSONL is missing or empty. jq is the only new dependency (already present per existing phase-b checks). What is NOT yet automated: - The cloud agent's prompt template doesn't yet INSTRUCT it to act on these digests. Currently the digest is data; whether the agent proposes rule + hook codification based on it depends on the free-text instructions in the schedule. Follow-up: codify a Phase B instruction block that maps each digest to a recommendation pattern. - Idempotency on `cp` for time-metrics: I use plain `cp` (not `cp -n`) so the latest local state always overwrites remote. The journals are append-only on the local side, so this is safe — but if two machines ever share one memory-repo it would corrupt. Out of scope for single-machine setup. === STATUS-TRUTH MARKER === shipped: functional stubs: 0 cargo-check: NOT-RUN (pure shell) behaviour-verified: yes follow-up-required: - Phase B prompt template — instruct cloud agent to act on the four digests (codify recurring patterns, calibrate ESTIMATE-HTC). - skill_invocations.jsonl will populate from next session onward. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 04:02:28 +08:00
Parfii-bot	e073df6c98	feat(tracking): close 3 last observability gaps — toolStats + skill-record + numeric-claims journal Closes the loop on "without full tracking the system can't make decisions" (user pushback on partial coverage). Three gaps that left the inference layer blind are now wired: GAP #1 — agent toolStats / token counts / cache hits captured ================================================================ `agent-outcome-backfill.sh` now appends one JSONL row per spawn to `~/.claude/memory/time-metrics/agent-toolstats.jsonl` with: agent_id, outcome, stubs, ts, tool_use_count, duration_ms, tool_stats {Read:N, Bash:M, ...}, tokens_in, tokens_out, cache_read, cache_write Sidecar journal (no schema migration). Production payload's .tool_response.totalToolUseCount / totalDurationMs / toolStats / usage fields land directly. Smoke-tested with synthetic spawn — row written. GAP #2 — skill_invocations table actually receives writes ================================================================ The `skill_invocations` table (schema v8) had 0 rows because no caller existed for `skill_metrics::record_invocation`. Added two pieces: (a) `kei-ledger record-skill <name> --success {0\|1}` CLI subcommand Mirrors record-cost; same dispatch shape. Optional `--agent-id`, `--trajectory-id`, `--duration-ms`, `--db`. Validates non-empty name + duration ≥ 0. Outputs `{"ok":true,"skill":"...","ts":N}`. (b) `hooks/skill-record.sh` — PostToolUse:Skill hook. 50 LOC POSIX. Detects Skill tool calls, derives success heuristic from tool_response (exit_code / status / content non-empty), shells out to `kei-ledger record-skill`. Bypass via SKILL_RECORD_BYPASS=1. 83 kei-ledger tests pass (16 unit + 67 integration). Smoke-tested end-to-end: `kei-ledger record-skill test-skill --success 1` inserts a row with correct fields. 
Phase D nightly skill-metrics decisions (archive if unused N days, re-extract if success<60% over M days, validated if >20 calls + >90% success) now have data to consume. GAP #3 — numeric-claims.jsonl receives every evidence-tagged claim ================================================================ RULE 0.18 mandated three markers `[REAL:]` / `[FROM-JOURNAL:]` / `[ESTIMATE-HTC:]` on every numeric/duration/cost claim, but no hook appended valid claims to the journal — the calibration data RULE 0.18 promised never accumulated. `hooks/numeric-claims-record.sh` — Stop hook, 140 LOC POSIX. Reads transcript_path from stdin, locates the last assistant message via recursive flatten (same pattern as agent-outcome-backfill.sh after the production-payload-shape fix), regex-extracts every `<phrase> [<TIER>: <pointer>]` triple, appends one JSONL row per claim. Idempotent within 1-second window to avoid double-recording on repeat Stop fires. Bypass via NUMERIC_CLAIMS_RECORD_BYPASS=1. Smoke test: synthetic transcript with 3 markers (REAL + ESTIMATE-HTC + FROM-JOURNAL) produced exactly 3 well-formed JSONL rows. Settings.json ================================================================ - PostToolUse:Skill matcher created (or augmented if already present) with skill-record.sh. - Stop:* matcher gains numeric-claims-record.sh after the existing chain (stop-verify, task-timer, session-end-dump, extract-task- durations, chat-numeric-postflag, affect-threshold-check, enrich-from-jsonl). What this does NOT do (deferred): - Backfill `skill_invocations` from past traces (history started today; Phase D cohort builds forward from now). - Migrate the agent toolStats sidecar JSONL into a proper ledger column. Append-only file is fine for the current scale. - Refactor main.rs (now 233 LOC, was 212; pre-existing CP debt flagged by skill-record agent — separate cleanup PR). 
=== STATUS-TRUTH MARKER === shipped: functional stubs: 0 cargo-check: PASS behaviour-verified: yes follow-up-required: - kei-ledger main.rs Constructor Pattern split (212→233 LOC) - Verify in next session: skill_invocations gets rows from real Skill tool use; numeric-claims.jsonl gets rows from real assistant messages with markers Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 03:42:09 +08:00
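The Phase D thresholds quoted in this commit (archive if unused for N days, re-extract if success is below 60%, validated if more than 20 calls with more than 90% success) can be written as a small decision function. This is a sketch under assumptions: the type names, the `unused_days_limit` parameter standing in for the unspecified N, the pre-aggregated window standing in for "over M days", and the order in which the rules are checked are all hypothetical.

```rust
/// Hypothetical outcome of the nightly skill-metrics pass.
#[derive(Debug, PartialEq)]
pub enum SkillDecision {
    Archive,
    ReExtract,
    Validated,
    Keep,
}

/// Aggregated stats for one skill over the evaluation window.
pub struct SkillWindow {
    pub calls: u32,
    pub successes: u32,
    pub days_since_last_use: u32,
}

/// Success rate in percent, or `None` when the skill was never called.
pub fn rate_pct(w: &SkillWindow) -> Option<f64> {
    if w.calls == 0 {
        None
    } else {
        Some(100.0 * f64::from(w.successes) / f64::from(w.calls))
    }
}

/// Applies the thresholds from the commit message; staleness is checked first
/// (an assumption, the source does not state a precedence).
pub fn decide(w: &SkillWindow, unused_days_limit: u32) -> SkillDecision {
    if w.days_since_last_use >= unused_days_limit {
        return SkillDecision::Archive;
    }
    match rate_pct(w) {
        Some(r) if r < 60.0 => SkillDecision::ReExtract,
        Some(r) if w.calls > 20 && r > 90.0 => SkillDecision::Validated,
        _ => SkillDecision::Keep,
    }
}
```

Returning `None` from `rate_pct` for zero calls keeps a never-invoked skill from being misread as a 0% success rate.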
Parfii-bot	af46684330	feat(secrets+catalog): orphan-detector for env vars + image/video/voice models Two parallel agents (both Sonnet 4.6 via the just-activated tier system) extended the substrate-unified-registry. First end-to-end proof that the Phase 4 router refactor saves money: no Opus spawns this round. PART 1 — `kei-registry secrets` subcommand (Agent A — code-implementer) Reads env-var NAMES from `~/.claude/secrets/.env` (RULE 0.8 SSoT) and per-project `secrets/.env`, greps the kit tree for usages, reports orphans (defined but unreferenced). Live run on this kit found 26 keys, 11 ORPHAN — actionable cleanup candidates incl. GitHub OAuth client creds, Godaddy keys, KeiGit admin creds, KEI_MEMORY_TOKEN. Files: - `_primitives/_rust/kei-registry/src/secrets.rs` (152 LOC) — pure read-side cube. SecretsReport + KeyRow types, env-file parser (KEY=value lines, validates `^[A-Z][A-Z0-9_]$`), walkdir-based scanner with skips (target/ node_modules/ .git/ _generated/), word-boundary regex per key. ASCII + JSON render. - `_primitives/_rust/kei-registry/src/secrets_tests.rs` (125 LOC) — 5 unit tests covering env parse, scan correctness, word-boundary regression (`MY_KEY` ≠ `MY_KEY_EXTRA`), JSON roundtrip, ORPHAN marker. - `_primitives/_rust/kei-registry/src/secrets_handler.rs` (58 LOC) — CLI dispatch handler. - `cli.rs`, `handlers.rs`, `lib.rs` extended with Secrets variant. Resolves the asymmetry called out in the design discussion: paths got atomization (commit `3422bdc`), keys get a query-layer instead. Reason: env-var NAMES are already public and stable; opaque atom-DNA over them adds zero security and full overhead. Orphan detection is the unique value, and a 30-LOC subcommand delivers it without a per-key atom file. 
PART 2 — kei-model catalog extension (Agent B — fal-ai-runner) Adds 10 generation-model entries with VERIFIED pricing per RULE 0.4: - google: gemini-3-1-flash-image, gemini-3-pro-image - fal.ai: flux-2-pro, flux-pro-1-1, kling-o3, veo-3, ideogram-v3, recraft-v3 - elevenlabs: elevenlabs-v3, elevenlabs-multilingual-v2 Pricing sourced from each provider's public pricing page (URLs cited per row in `notes` + `source_url` fields); 8/10 verified, 2 marked needs-verification (gemini-3-pro-image price not found on public page). Schema additions to `_primitives/_rust/kei-model/src/model.rs` to support the new entries without `provider = "local"` placeholder: - Provider enum + 3 variants: Google, Fal, Elevenlabs (with as_str + parse impls). - Capability enum + 9 variants: image-gen, text-to-image, image-edit, video-gen, text-to-video, image-to-video, voice-gen, text-to-speech, voice-clone (with serde rename + as_str + parse). Pricing struct unchanged: per-image / per-second / per-1k-chars unit costs ride existing `output_per_mtok_micro` field with the unit documented in `notes` (e.g. "Per-image cost. 1 unit = 1 image."). A proper Pricing.unit field is a follow-up. Files: - `_primitives/_rust/kei-model/src/model.rs` (+24 LOC enum extensions) - `_primitives/_rust/kei-model/data/models.toml` (+216 LOC, 471 total) `kei-model list` returns the full 21-model catalog incl. new providers. Tests: - kei-registry: 25 passed (existing + 5 secrets tests + 10 status) - kei-model: 0 (no unit tests in crate, parser smoke via list) - agent-assembler: 29 passed (no regressions) Verification (cited): - `./target/release/kei-registry secrets --env-file ~/.claude/secrets/.env` emits real report 26/11 orphan. - `./target/release/kei-model list` parses all 21 entries cleanly. - `cargo build --release --workspace` clean. What this does NOT do (deferred): - Pricing.unit field (per-mtok / per-image / per-second / per-1k-chars discriminator) — needs Rust struct refactor + cost-estimator update. 
- `secrets` skip-list extension (worktrees, _ts_packages/node_modules duplicate counts) — minor noise. - gemini-3-pro-image pricing (no public page; vendor-specific quote needed). === STATUS-TRUTH MARKER === shipped: functional stubs: 0 cargo-check: PASS behaviour-verified: yes follow-up-required: - Pricing.unit field for cost-estimator correctness on gen models - secrets scan: skip .claude/worktrees/ to avoid duplicate counts - gemini-3-pro-image price verification Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 00:06:16 +08:00
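The word-boundary regression mentioned for the secrets scanner (`MY_KEY` must not count as a usage inside `MY_KEY_EXTRA`) can be reproduced without a regex engine. The sketch below is a std-only approximation of that check, not the crate's implementation (which the commit says uses a word-boundary regex per key); `references_key` and `orphans` are hypothetical names.

```rust
/// Bytes that count as part of an identifier for boundary purposes.
fn is_word_byte(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_'
}

/// True if `key` occurs in `haystack` with a non-identifier byte
/// (or the string edge) on both sides.
pub fn references_key(haystack: &str, key: &str) -> bool {
    if key.is_empty() {
        return false;
    }
    let bytes = haystack.as_bytes();
    let mut from = 0;
    while let Some(pos) = haystack[from..].find(key) {
        let start = from + pos;
        let end = start + key.len();
        let left_ok = start == 0 || !is_word_byte(bytes[start - 1]);
        let right_ok = end == bytes.len() || !is_word_byte(bytes[end]);
        if left_ok && right_ok {
            return true;
        }
        // Resume after this match; keys made of identifier characters
        // cannot have a valid overlapping match.
        from = end;
    }
    false
}

/// Keys that are defined but referenced in none of the scanned sources.
pub fn orphans<'a>(keys: &[&'a str], sources: &[&str]) -> Vec<&'a str> {
    keys.iter()
        .copied()
        .filter(|k| !sources.iter().any(|s| references_key(s, k)))
        .collect()
}
```

A plain substring search would report `MY_KEY` as used whenever `MY_KEY_EXTRA` appears, which hides exactly the orphans the report is meant to surface.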
Parfii-bot	cb1fddeabb	feat(model-tier+branch-dna): activate cost router + give branches DNA Phase 4 of substrate-unified-registry: turn on the existing kei-model-router by changing manifest defaults from `model = "opus"` to `model = "sonnet"` for routine agents, and give every git branch a deterministic DNA in the kei-status dashboard. The model-tier system was BUILT (`_primitives/_rust/kei-model-router/` crate with Beta posterior, complexity τ-estimator, escalate ladder, calibrate subcommand) and the advisor hook (`~/.claude/hooks/model-router-advisor.sh`) was REGISTERED. But every ledger row from this session ran on Opus because: 1. All 38 manifests hard-coded `model = "opus"` → no chance for the router to recommend cheaper. 2. The orchestrator (me) ignored the stderr advisory. This commit closes (1). (2) is a behavioural change tracked separately. Manifest reclassification (4 Opus + 34 Sonnet): Opus (hard reasoning): - architect (system-design synthesis) - ml-implementer (Math-First paradigm) - ml-researcher (literature analysis) - security-auditor (deep risk synthesis) Sonnet (everything else): - 8 code-implementer-* + code-implementer - 5 critic-* + critic - 6 infra-implementer-* + infra-implementer - 4 researcher-* + researcher - 6 validator-* + validator - 3 security-auditor-{differential,supply-chain,variant} - cost-guardian, fal-ai-runner, frontend-validator, modal-runner Regenerated all 38 `_generated/*.md` so the YAML frontmatter `model:` field matches the manifest. Branch DNA (kei-registry status): - New `compute_branch_dna(name, commit_sha)` in `status.rs`. Format `branch::<sha8(name)>::<sha8(commit)>`, mirrors kei-shared DNA wire layout `<role>::<caps>::<scope_sha8>::<body_sha8>`. - Deterministic — same `(name, commit)` → same DNA. Changes when either changes. No DB persistence: the underlying truth lives in `.git/refs/heads/<name>`. - 3 new unit tests cover format, determinism, name-change, commit- change. `cargo test status::tests` → 10 passed. 
`kei-registry status` output now shows DNA prefix per branch alongside ahead/behind, last commit. Combined with existing per-block DNA in the [Blocks] and [Path Atoms] sections + `dna` column on `agents` table in kei-ledger, every artefact in the dashboard has an identifier: Atoms (incl path-atoms) → atom::<caps>::<scope>::<body> (registry) Skills/Rules/Hooks/Prim → <role>::<caps>::<scope>::<body> (registry) Agent forks → row.dna in agents table (ledger) Local branches → branch::<sha8>::<sha8> (computed) What this does NOT do: - No outcome backfill — the 205 NULL outcomes in ledger still prevent the Beta posterior from learning. Router falls back to top-tier until ≥1 datapoint per (task_class, model) accumulates. Tracked as follow-up. - No post-checkout hook to auto-register branches in kei-ledger. Live shell-out to `git for-each-ref` is fast enough for the dashboard; persistence buys nothing the .git tree doesn't already give. === STATUS-TRUTH MARKER === shipped: functional stubs: 0 cargo-check: PASS behaviour-verified: yes follow-up-required: - Outcome backfill hook (writes outcome to ledger after agent done) - User /model claude-sonnet-4-6 for current session (5x cheaper) - Push the orchestrator (me) to read advisor stderr in real-time Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 23:05:07 +08:00
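The branch DNA layout `branch::<sha8(name)>::<sha8(commit)>` is a pure function of its two inputs, which the following sketch illustrates. One loud assumption: the standard library has no SHA implementation, so `hash8` below uses 32-bit FNV-1a as a stand-in for the real `sha8`; the values it produces will not match the kit's actual DNA strings, only the shape and the determinism property.

```rust
/// Stand-in for the real `sha8`: 32-bit FNV-1a rendered as 8 hex characters.
/// This is NOT a SHA digest; it only mimics the 8-hex-char shape.
fn hash8(input: &str) -> String {
    let mut h: u32 = 0x811c_9dc5; // FNV-1a offset basis
    for b in input.bytes() {
        h ^= u32::from(b);
        h = h.wrapping_mul(0x0100_0193); // FNV prime
    }
    format!("{h:08x}")
}

/// Same `(name, commit_sha)` always yields the same DNA string;
/// changing either input changes the corresponding segment.
pub fn compute_branch_dna(name: &str, commit_sha: &str) -> String {
    format!("branch::{}::{}", hash8(name), hash8(commit_sha))
}
```

Because nothing is persisted, the DNA can be recomputed from `.git/refs/heads/<name>` at any time, which is the property the commit relies on.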
Parfii-bot	d6dbdee870	feat(kei-registry): status subcommand — cross-cutting substrate dashboard Phase 3 of substrate-unified-registry: a single command shows every live artefact across the three sources without merging stores. `kei-registry status` joins: 1. `blocks` table (kei-registry SQLite) — active counts per BlockType, plus the registered path-atoms with DNA prefix + body sha8. 2. `git for-each-ref refs/heads` (shell-out, no DB persistence) — local branches, current marker, ahead/behind via `upstream:track,nobracket`. 3. `agents` table (kei-ledger SQLite) — fork counts per status (running/done/failed/merged/rejected). Missing ledger DB → section skipped, never an error. Output: ASCII multi-section table by default; `--format json` for machine consumption. Files: - `_primitives/_rust/kei-registry/src/status.rs` — new module, ~270 LOC. Pure read-side per Constructor Pattern. 7 unit tests cover `parse_track` (in sync / ahead / behind / both / "gone"), DNA prefix rendering, and empty-status section presence. - `_primitives/_rust/kei-registry/src/cli.rs` — new `Status` variant with `--db`, `--git-repo`, `--ledger-db`, `--format` flags. - `_primitives/_rust/kei-registry/src/handlers.rs` — `handle_status` dispatcher, ASCII/JSON branching. - `_primitives/_rust/kei-registry/src/lib.rs` — module export. End-to-end run from kit root shows the prior gap: 17 local branches (many `worktree-agent-*` orphans), kei-ledger summary 4 running / 158 done / 35 failed / 7 merged / 0 rejected — visibility the user asked for ("to see it in every session, so I don't have to run around the disk looking for unmerged ones"). What this does NOT do (Phase 4): - No orphan detection (`kei-status orphans`) — counts only. - No auto-registration of branches into kei-ledger (Phase 2). Branches come from live `git for-each-ref` shell-out; if the repo moves or is deleted the row vanishes from the dashboard. Acceptable for v1. 
=== STATUS-TRUTH MARKER === shipped: functional stubs: 0 cargo-check: PASS behaviour-verified: yes follow-up-required: - Phase 2 (post-checkout hook → kei-ledger auto-register) - Phase 4 (orphan detection: branches with no commits in N days, path-atoms with no consumers, agent forks stuck running) - --filter flags (--type, --status) for targeted queries Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 22:46:53 +08:00
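The five `parse_track` cases listed in this commit (in sync / ahead / behind / both / "gone") correspond to the strings that `git for-each-ref` emits for `%(upstream:track,nobracket)`. A minimal parser for those strings might look like the sketch below; the `Track` enum and its variant names are assumptions, not the types in `status.rs`.

```rust
/// Hypothetical parsed form of `%(upstream:track,nobracket)`.
#[derive(Debug, PartialEq, Eq)]
pub enum Track {
    /// Empty output: nothing to push or pull (git also prints an empty
    /// string when no upstream is configured, so callers may need
    /// `%(upstream)` to tell the two apart).
    InSync,
    /// One or both of the "ahead N" / "behind N" counters.
    Counts { ahead: u32, behind: u32 },
    /// The upstream ref no longer exists.
    Gone,
}

/// Parses strings such as "", "ahead 2", "behind 3", "ahead 2, behind 3", "gone".
/// Returns `None` for anything it does not recognise.
pub fn parse_track(raw: &str) -> Option<Track> {
    let raw = raw.trim();
    if raw.is_empty() {
        return Some(Track::InSync);
    }
    if raw == "gone" {
        return Some(Track::Gone);
    }
    let (mut ahead, mut behind) = (0, 0);
    for part in raw.split(',') {
        let mut words = part.split_whitespace();
        let kind = words.next()?;
        let n: u32 = words.next()?.parse().ok()?;
        match kind {
            "ahead" => ahead = n,
            "behind" => behind = n,
            _ => return None,
        }
    }
    Some(Track::Counts { ahead, behind })
}
```

Returning `None` on unknown input, instead of silently treating it as in sync, keeps a future change in git's output format from showing up as a falsely clean dashboard.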
Parfii-bot	22ae9d1de5	fix(kei-registry): short_path strips _blocks/_manifests/_atoms/_roles/_caps/agents Root-cause of the username-path leak in DNA-INDEX.md (107 atom rows in v0.17 — sed-patched in `a23910d`). The encyclopedia render's short_path() prefix list omitted every top-level dir except `_primitives/`, `skills/`, `hooks/`, `rules/` — so atom and capability rows fell through to the absolute path stored in the registry DB, leaking the maintainer's home prefix into the public encyclopedia. Fix: add `_blocks/`, `_manifests/`, `_generated/`, `_atoms/`, `_assembler/`, `_roles/`, `_capabilities/`, `agents/`, `docs/` to the prefix list. 8 unit tests cover the new prefixes (fixtures use CI-style paths like `/srv/ci/build/...` so the source file does not contain a maintainer-shaped path that would itself trip the local pre-commit hook + leak-check CI). Verified: regenerated docs/DNA-INDEX.md has 0 absolute-path hits. Source fix supersedes the sed hot-fix in `a23910d` — the next `kei-registry encyclopedia` invocation will not regress. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 21:09:15 +08:00
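The prefix-list fix described in this commit amounts to cutting an absolute path at the first known kit directory. The sketch below shows that idea using the prefixes named in the message; choosing the earliest match and requiring a `/` before the prefix are assumptions, and the real `short_path()` may behave differently on ambiguous paths.

```rust
/// Top-level kit directories named in the commit message.
const KIT_PREFIXES: &[&str] = &[
    "_primitives/", "skills/", "hooks/", "rules/",
    "_blocks/", "_manifests/", "_generated/", "_atoms/", "_assembler/",
    "_roles/", "_capabilities/", "agents/", "docs/",
];

/// Returns the path starting at the earliest known kit directory, or the
/// input unchanged when no prefix matches (the fall-through that caused
/// the original leak for unlisted directories).
pub fn short_path(path: &str) -> &str {
    KIT_PREFIXES
        .iter()
        .copied()
        .filter_map(|prefix| {
            path.match_indices(prefix)
                .map(|(i, _)| i)
                // Only accept matches that start a path component.
                .find(|&i| i == 0 || path.as_bytes()[i - 1] == b'/')
        })
        .min()
        .map_or(path, |i| &path[i..])
}
```

The fall-through branch is the part worth testing: any directory missing from the list silently yields the full absolute path again.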
Parfii-bot	c35e1ae9ca	chore(kit): wire kei-db-contract into installer + drop final #[path] hack A1 — install.sh wiring for kei-db-contract: - install/lib-substrate.sh substrate_core_binaries(): add kei-db-contract to always-copy list. End users now get the binary in ~/.cargo/bin/ immediately after install (no manual cargo install --path needed). A2 — Wave B follow-up: drop #[path] hack from guard_test_corpus.rs - tests/guard_test_corpus.rs: #[path = "../src/injection_*"] mod ... → use kei_memory::injection_guard::scan - Now uses Wave B's [lib] target like tests/integration.rs already does. - 4 tests still pass. Verified via cargo test: 18 lib + 4 corpus + 3 ingest_guard + 1 injection_unit + 4 dedup + 8 integration + 4 ingest_real_trace = 42 tests, all green. === STATUS-TRUTH MARKER === shipped: functional stubs: 0 cargo-check: PASS cargo-test: PASS (42 tests, 0 failures) behaviour-verified: yes follow-up-required: - tests/ingest_guard_tests.rs already migrated (Wave A's earlier work) - kei-db-contract still requires kit user to have run install.sh; existing installs need re-run. Kit ledger-validate should add post-install probe. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 17:55:52 +08:00
Parfii-bot	f3f5f79760	feat(frontend-loop): kei-db-contract primitive + frontend-validator agent + auto-dev-guard hook Frontend continuous-quality loop landed. Three composable cubes: Wave 1 — kei-db-contract primitive (~870 LOC, 7 cubes per Constructor Pattern): - Diffs SQL CREATE TABLE migrations against TypeScript type/interface declarations - 4 drift modes: ORPHAN-SQL, ORPHAN-TS, TYPE-MISMATCH, NULL-MISMATCH - Reuses sqlparser-rs (Apache 2.0) + regex + walkdir + serde_json + clap - CLI: kei-db-contract <project-root> [--output json\|text] [--strict] - 5/5 integration tests pass (cargo check + cargo test green) - Smoke-tested on keisei-marketplace: drift_count=266 across 30 tables (expected — marketplace uses raw better-sqlite3 without explicit row types) Wave 2 — frontend-validator agent + dev-guard skill extension: - New _manifests/frontend-validator.toml (substrate_role: edit-local, tools: Bash+Read+Glob+Grep) - Agent runs: stack detect → tsc --noEmit → eslint → kei-db-contract → playwright (optional) - Severity rules: TYPE_CHECK FAIL = block, DB_CONTRACT drift > 0 = block, lint = advisory - skills/dev-guard/SKILL.md extended: 4th agent triggered on .tsx/.ts/.dart edits or DB-layer touches - adaptive-depth table extended with frontend + DB-layer rows Wave 3 — auto-dev-guard.sh hook (PostToolUse:Edit\|Write): - Trivial-edit gate: skip if delta < 30 LOC (avoid spawn fatigue) - File-pattern match: .tsx\|.ts\|.svelte\|.vue\|.dart OR migrations/.sql OR src/db/ OR src/types/ OR prisma/schema.prisma OR drizzle.config.* - Auto-runs kei-db-contract for DB-layer edits if binary on PATH - Stderr advisory only (exit 0 always — never blocks) - Bypass: KEI_DISABLED_HOOKS or KEI_HOOK_PROFILE in {advisory-off, minimal, off} - Smoke-tested with synthetic Edit input (39 LOC delta on .tsx → emits advisory) - Registered in hooks/hooks.json under PostToolUse:Write\|Edit chain Reusability map (Constructor Pattern compose): shared cubes: detect-stack, tsc, eslint, kei-db-contract, 
kei-visual-snapshot (deferred) orchestrators: /dev-start (pre), /dev-guard (during, NOW with frontend-validator), /dev-ship (final), /site-create (init) Verify-before-commit (RULE 0.13): - cargo check -p kei-db-contract: PASS - cargo test -p kei-db-contract: 5 passed - jq . hooks/hooks.json: valid - bash hooks/auto-dev-guard.sh < synthetic-input: works (frontend-relevant edit detected, exit 0) === STATUS-TRUTH MARKER === shipped: functional stubs: 0 cargo-check: PASS cargo-test: PASS (5 tests, 0 failures) behaviour-verified: yes follow-up-required: - kei-visual-snapshot primitive (Playwright wrap) — Wave 4, deferred - /dev-start frontend-contract-designer agent + /dev-ship frontend-final-gate — Wave 5, after Waves 1-3 have had a trial run - install.sh wiring for kei-db-contract binary - hermes-style emit-on-drift advisory mode Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 15:34:39 +08:00
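The two gates of the auto-dev-guard hook (skip edits under 30 LOC, then match the file against frontend and DB-layer patterns) are implemented in shell in the kit; the Rust sketch below only restates the decision logic for clarity. Function names are hypothetical, and `migrations/.sql` from the commit text is read here as "a `.sql` file under a `migrations/` directory", which is an interpretation.

```rust
/// True when the path matches one of the frontend / DB-layer patterns
/// listed in the commit message.
pub fn is_frontend_relevant(path: &str) -> bool {
    const EXTS: &[&str] = &[".tsx", ".ts", ".svelte", ".vue", ".dart"];
    let file_name = path.rsplit('/').next().unwrap_or(path);
    EXTS.iter().any(|ext| path.ends_with(*ext))
        || (path.contains("migrations/") && path.ends_with(".sql"))
        || path.contains("src/db/")
        || path.contains("src/types/")
        || path.ends_with("prisma/schema.prisma")
        || file_name.starts_with("drizzle.config.")
}

/// Trivial-edit gate first (delta < 30 LOC is skipped), then the pattern match.
/// The hook itself is advisory only: a `true` here means "print a hint",
/// never "block the edit".
pub fn should_advise(path: &str, delta_loc: usize) -> bool {
    delta_loc >= 30 && is_frontend_relevant(path)
}
```

Checking the size gate before the pattern match mirrors the stated goal of avoiding advisory noise on small edits.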
Parfii-bot	902fb3e81a	feat(kei-memory): functional schema fix + 4-wave architecture refactor Wave A: Functional ingest fix (root cause of empty Sleep reports): - Rewrote TraceLine struct to match real Claude Code trace JSONL: type (was kind), timestamp ISO8601 (was epoch ts), message Object, cwd / gitBranch / parentUuid / uuid / subtype / toolUseID / toolUseResult - New src/extract.rs: extract_tool_uses + extract_tool_result walks message.content[] for nested tool_use / tool_result blocks - New src/classifier.rs: explicit table classifier (tool_error, user_correction, retry_loop, permission_denied, tool_use:<name>, ...) replaces shallow heuristic - New src/error.rs: KeiMemoryError enum (IO/Parse/Db) replaces semantic mismatch where IO error was wrapped as rusqlite::InvalidParameterName - New src/trace_line.rs: TraceLine + helpers (cube extraction) - Schema migration v3: events.cwd column + 3 hot-query indices (events.tool, events.file_path, events.ts) + UNIQUE on patterns - New tests/ingest_real_trace.rs: synth-fixture asserts tool/file/cwd/class extraction Wave B: Lib crate split: - Cargo.toml: [lib] target added alongside existing [[bin]] - src/lib.rs: pub re-export of all 18 modules - src/main.rs: 11 mod declarations replaced by single use kei_memory::{…} - tests/integration.rs: #[path] hack replaced by use kei_memory::{…} Wave C: TF-IDF dedup + single-JOIN + filter_map fix: - Schema migration v2: tokens.idf_dirty column + flag-based dedup - index_document no longer triggers per-call recompute_idf rebuild - top_similar uses single JOIN via vectors_for_overlapping_sessions helper (was N round-trips, one session_vector per candidate) - All filter_map(|r| r.ok()) row-error swallowing replaced with ? propagation - New tests/tfidf_idf_dedup.rs: 4 tests covering dedup behaviour, IDF emptiness, JOIN-pruning, empty-query safety Wave D: Commands split + nits: - New src/dump.rs (43 LOC) + src/stats.rs (33 LOC): CLI renderers extracted from commands.rs (was inline SQL + format) - src/commands.rs: thin wrappers, -42 LOC - src/injection_guard.rs: inline tests removed (-26 LOC), file under 200 LOC threshold - tests/injection_guard_unit.rs (new): 4 tests in proper integration crate - src/patterns.rs: INSERT replaced with INSERT...ON CONFLICT...DO UPDATE (idempotent re-ingest, uses Wave A's UNIQUE index) - src/analyze.rs + src/coaccess.rs: filter_map row-error fixes - src/coaccess.rs: misleading PK comment rewritten Verify-before-commit (RULE 0.13 §"Verify-before-commit"): - cargo check --all-targets: PASS (1 unrelated dead-code warning) - cargo test: 42 passed, 0 failed across 9 test binaries - STATUS-TRUTH markers aggregated at .claude/agents/_merge/kei-memory-2026-05-01/ Architect-spotted ARCH-MAJOR + ARCH-MINOR + ARCH-NIT findings addressed: - ARCH-MAJOR Cargo.toml binary-only (Wave B) - ARCH-MAJOR schema missing indices (Wave A v3) - ARCH-MAJOR ingest_jsonl choke point (Wave A: extract.rs + classifier.rs) - ARCH-MAJOR idf O(N·V) per-call rebuild (Wave C) - ARCH-MINOR patterns no UPSERT (Wave D) - ARCH-MINOR commands.rs houses dump+stats (Wave D) - ARCH-MINOR classifier silent contract (Wave A) - ARCH-MINOR IO error wrapped as rusqlite (Wave A) - ARCH-MINOR injection_guard inline tests (Wave D) - ARCH-MINOR tfidf top_similar N round-trips (Wave C) - ARCH-NIT 3× filter_map(|r| r.ok()) sites (Wave C + D) - ARCH-NIT coaccess misleading comment (Wave D) === STATUS-TRUTH MARKER === shipped: functional stubs: 0 cargo-check: PASS cargo-test: PASS (42 tests, 0 failures) behaviour-verified: yes follow-up-required: - tests/ingest_guard_tests.rs + tests/guard_test_corpus.rs still on #[path] hack (Wave B follow-up note, ~5 LOC) - dead_code warning Severity::Warn unused (pre-existing, not blocking) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 14:10:06 +08:00
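The kei-memory commit above replaces `filter_map(|r| r.ok())` row-error swallowing with `?` propagation and adds a dedicated error enum. The sketch below shows that difference using only the standard library. It is not the crate's actual code: `MemoryError`, `parse_row`, `collect_lossy`, and `collect_strict` are hypothetical names, the `Db` variant holds a `String` rather than a `rusqlite` error, and integer parsing stands in for database row mapping.

```rust
use std::fmt;
use std::num::ParseIntError;

/// Hypothetical stand-in for an IO/Parse/Db error enum.
/// Assumption: the real Db variant would wrap a database error type.
#[allow(dead_code)]
#[derive(Debug)]
pub enum MemoryError {
    Io(std::io::Error),
    Parse(ParseIntError),
    Db(String),
}

impl fmt::Display for MemoryError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            MemoryError::Io(e) => write!(f, "io error: {e}"),
            MemoryError::Parse(e) => write!(f, "parse error: {e}"),
            MemoryError::Db(msg) => write!(f, "db error: {msg}"),
        }
    }
}

impl std::error::Error for MemoryError {}

// The From impls are what let `?` convert the source error automatically.
impl From<std::io::Error> for MemoryError {
    fn from(e: std::io::Error) -> Self {
        MemoryError::Io(e)
    }
}

impl From<ParseIntError> for MemoryError {
    fn from(e: ParseIntError) -> Self {
        MemoryError::Parse(e)
    }
}

/// Stand-in for a fallible per-row mapper (hypothetical).
fn parse_row(raw: &str) -> Result<i64, ParseIntError> {
    raw.trim().parse::<i64>()
}

/// Old pattern: failed rows vanish silently, so the caller cannot
/// tell "no data" apart from "every row failed".
pub fn collect_lossy(rows: &[&str]) -> Vec<i64> {
    rows.iter()
        .map(|raw| parse_row(raw))
        .filter_map(|r| r.ok())
        .collect()
}

/// New pattern: the first failing row aborts the collection and the
/// error reaches the caller through `?`.
pub fn collect_strict(rows: &[&str]) -> Result<Vec<i64>, MemoryError> {
    let mut out = Vec::with_capacity(rows.len());
    for raw in rows {
        out.push(parse_row(raw)?);
    }
    Ok(out)
}
```

Given the input `["1", "x", "3"]`, `collect_lossy` returns only the two rows that parsed, while `collect_strict` returns `Err(MemoryError::Parse(_))`. This is the kind of silent loss the commit blames for empty reports.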
Parfii-bot	0be354a920	KeiSeiKit-public — clean state Single-commit clean baseline after security scrub of niche-tells, project codenames, internal jargon, and contributor-email leaks. Contents: - 100 Rust crates (_primitives/_rust/) - 37 agent manifests (_manifests/) + generated specs (_generated/) - 67 user-invocable skills (skills/) - 33 hooks (hooks/) - Composition blocks (_blocks/) - Documentation (docs/, README.md) - TS adapter packages (_ts_packages/) - Assembler (_assembler/) - Roles (_roles/) - Templates (_templates/) - Forgejo CI (.forgejo/) Author: Denis Parfionovich <info@greendragon.info> License: see LICENSE.	2026-05-01 12:09:03 +08:00

25 commits