Commit graph

4 commits

Author SHA1 Message Date
Parfii-bot
4e99057d2b fix(perf): bound per-user lock LRU + stream-cap atom subprocess output
Two resource-exhaustion fixes from Opus Rust + Sonnet Rust audits.

1. kei-cortex per_user_locks DashMap unbounded growth (HIGH)
   File: kei-cortex/src/state.rs
   Bug: per_user_locks: DashMap<String, Arc<Mutex<()>>> inserted on every
   distinct user_id; never evicted. Auth'd attacker with 1M unique user_ids
   could OOM the daemon (~150 bytes/entry = 15GB at 100M entries).

   Fix: replaced DashMap with tokio::sync::Mutex<LruCache<String,
   Arc<TokioMutex<()>>>> capped at PER_USER_LOCK_CAP = 1024. Eviction is
   safe because callers hold their own Arc clone for their critical section;
   dropping the registry slot retires only the registry's reference. Used
   tokio::sync::Mutex for the registry because LruCache::get mutates the
   recency list and requires &mut self.

   Constructor Pattern: state.rs split into state.rs (184 LOC) +
   state_factories.rs (64 LOC, new). Tests added: user_lock_evicts_past_cap
   (registry stays ≤1024 after 2048 inserts), user_lock_keeps_most_recent
   (LRU recency preserved). Existing user_lock_is_stable_per_user +
   user_lock_differs_per_user updated to async — sole call site
   (handlers/portrait.rs) gains .await.

2. kei-runtime stdout/stderr cap was post-hoc (HIGH)
   File: kei-runtime/src/invoke.rs
   Bug: wait_with_output() buffered ALL child stdout/stderr; only cap_bytes
   truncated AFTER the child finished. A malicious atom writing 10 GB stdout
   (or a buggy one looping infinitely) OOM'd the runtime BEFORE the cap fired.

   Fix: replaced wait_with_output() with two reader threads sharing
   KillHandle = Arc<Mutex<Option<Child>>>. Each reader appends bytes up to
   STREAM_CAP = 16 MiB; on cap exceedance the reader KILLS the child from
   inside the reader thread (critical — otherwise the unbounded writer would
   never EOF and a post-hoc kill would never fire). Both readers drain the
   closing pipe to EOF and return. Truncation surfaces as
   InvokeError::SubprocessError with explicit "exceeded N byte cap" message.

   Constructor Pattern: invoke.rs decomposed into invoke.rs (159 LOC) +
   invoke_io.rs (146 LOC, new) + invoke_error.rs (54 LOC, new). Test added:
   invoke_kills_runaway_atom — stages a kei-flood script running cat
   /dev/zero, verifies (a) non-zero exit, (b) stdout < 18 MiB, (c)
   "cap"/"subprocess" in stderr.

cargo check --workspace: clean. cargo test -p kei-cortex -p kei-runtime
--test-threads=1: 471 pass / 0 fail. Pre-existing openai_loop_wiring.rs
parallel-run flake (state collision when test-threads>1) is unrelated and
unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 15:39:50 +08:00
Parfii-bot
7cc544fd85 chore: author email + Cargo metadata SSoT (parfionovich@keilab.io)
Two related changes:

1. Author email update across the kit
   - All `info@greendragon.info` references replaced with `parfionovich@keilab.io`
   - Touched: NOTICE, README.md, _ts_packages/package.json (and 5 adapter packages),
     plus 90+ Cargo.toml files
   - Apache-2.0 attribution unchanged (Denis Parfionovich, 2026)

2. Cargo workspace.package SSoT for author/license/repository/homepage
   - Added to [workspace.package]:
     authors    = ["Denis Parfionovich <parfionovich@keilab.io>"]
     license    = "Apache-2.0"
     repository = "https://github.com/KeiSei84/KeiSeiKit-1.0"
     homepage   = "https://github.com/KeiSei84/KeiSeiKit-1.0"
   - All ~89 member crates migrated from inline declarations to:
     authors.workspace    = true
     license.workspace    = true
     (repository/homepage where applicable)
   - Closes audit gap: kei-graph-stream, kei-cortex, kei-shared previously had no
     license field at the crate level, blocking `cargo publish` on those.
     Now they inherit Apache-2.0 from workspace.
   - kei-scheduler/Cargo.toml: removed stray duplicate `authors` line introduced
     by an earlier migration sweep.

cargo check --workspace: clean. No code changes; metadata-only migration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 13:55:28 +08:00
Parfii-bot
cb1090bef3 fix(security): RCE allowlist + WebSocket auth + SSH option-injection
Group D — three independent security primitives hardening (post-audit 2026-05-02).

kei-runtime — atom invoke RCE allowlist:
- invoke.rs: is_safe_crate_name validator (regex ^kei-[a-z][a-z0-9-]+$);
             rejects /, \\, .., :, absolute paths, empty, >128 chars.
             InvalidAtom error variant.
             stdout/stderr capped at 16 MiB (was unbounded).
- main.rs: InvalidAtom mapped to exit code 2.
- tests/invoke_exit_codes_smoke.rs: invoke_unsafe_crate_name_exits_2 added.
- Closes: any user able to write atoms/*.md with crate_name: "rm" or "sudo"
           triggered arbitrary command execution.

kei-graph-stream — WebSocket bearer + Origin:
- auth.rs (new, 142 LOC): token load + bearer extraction + Origin allowlist +
                            ConstantTimeEq compare; 8 unit tests.
- ws.rs: ws_handler validates Origin + bearer before upgrade (403/401 on failure).
- main.rs: --public-bind-i-accept-the-leak flag required for non-loopback bind;
            else bail!() with explicit error.
- tests/smoke.rs: rewritten with Origin + bearer headers via connect_async_with_config.
- Closes: WebSocket /stream had zero auth, zero Origin check; browser CSWSH could
           subscribe to agent activity broadcast; KEI_GRAPH_STREAM_BIND env silently
           accepted any SocketAddr.

kei-compute-baremetal — SSH option injection (CVE-2023-51385 class):
- ssh.rs: is_safe_user + is_safe_host validators (alphanumeric + -_.; reject leading -;
           max 64 chars; no @, :, /, \\, space).
- ssh.rs: -- sentinel before user@host argv (OpenSSH 9.6+ stops flag parsing).
- ssh.rs: StrictHostKeyChecking=yes default; KEI_BAREMETAL_ACCEPT_NEW=1 for TOFU.
- error.rs: InvalidRegion variant.
- provider.rs: validators applied in target_for_spec + target_for_handle.
- Closes: spec.region "-oProxyCommand=evil" triggered local RCE before TCP connect.

Test results: 29 passed; 0 failed across all three crates. cargo check clean.

Findings: RCE allowlist (Wave-A) + WebSocket auth (Wave-B) + SSH injection (Wave-B)
were unique-per-retest discoveries. None present in original wave-1 audit.

Note: kei-compute-baremetal/src/provider.rs at 300 LOC (was 268; +32 from validators).
Pre-existing >200 LOC violation, fix scope was security-additions only. Follow-up:
split provider.rs into provider.rs (<200) + provider_tests.rs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 21:40:24 +08:00
Parfii-bot
0be354a920 KeiSeiKit-public — clean state
Single-commit clean baseline after security scrub of niche-tells,
project codenames, internal jargon, and contributor-email leaks.

Contents:
- 100 Rust crates (_primitives/_rust/)
- 37 agent manifests (_manifests/) + generated specs (_generated/)
- 67 user-invocable skills (skills/)
- 33 hooks (hooks/)
- Composition blocks (_blocks/)
- Documentation (docs/, README.md)
- TS adapter packages (_ts_packages/)
- Assembler (_assembler/)
- Roles (_roles/)
- Templates (_templates/)
- Forgejo CI (.forgejo/)

Author: Denis Parfionovich <info@greendragon.info>

License: see LICENSE.
2026-05-01 12:09:03 +08:00