Two resource-exhaustion fixes from Opus Rust + Sonnet Rust audits.

1. kei-cortex per_user_locks DashMap unbounded growth (HIGH)

File: kei-cortex/src/state.rs

Bug: per_user_locks: DashMap<String, Arc<Mutex<()>>> gained an entry for every distinct user_id and never evicted any. An authenticated attacker cycling through unique user_ids could OOM the daemon (~150 bytes/entry: about 150 MB at 1M entries, 15 GB at 100M entries).

Fix: replaced the DashMap with tokio::sync::Mutex<LruCache<String, Arc<TokioMutex<()>>>> capped at PER_USER_LOCK_CAP = 1024. Eviction is safe because callers hold their own Arc clone for the duration of their critical section; dropping the registry slot retires only the registry's reference. The registry uses tokio::sync::Mutex because LruCache::get mutates the recency list and requires &mut self.

Constructor Pattern: state.rs split into state.rs (184 LOC) + state_factories.rs (64 LOC, new).

Tests added: user_lock_evicts_past_cap (registry stays ≤1024 after 2048 inserts), user_lock_keeps_most_recent (LRU recency preserved). The existing tests user_lock_is_stable_per_user and user_lock_differs_per_user are now async; the sole call site (handlers/portrait.rs) gains .await.

2. kei-runtime stdout/stderr cap was post-hoc (HIGH)

File: kei-runtime/src/invoke.rs

Bug: wait_with_output() buffered ALL child stdout/stderr; cap_bytes truncated only AFTER the child finished. A malicious atom writing 10 GB to stdout (or a buggy one looping forever) OOM'd the runtime BEFORE the cap fired.

Fix: replaced wait_with_output() with two reader threads sharing KillHandle = Arc<Mutex<Option<Child>>>. Each reader appends bytes up to STREAM_CAP = 16 MiB; when the cap is exceeded, the reader KILLS the child from inside the reader thread. This is critical: otherwise the unbounded writer would never reach EOF and a post-hoc kill would never fire. Both readers then drain the closing pipe to EOF and return. Truncation surfaces as InvokeError::SubprocessError with an explicit "exceeded N byte cap" message.

Constructor Pattern: invoke.rs decomposed into invoke.rs (159 LOC) + invoke_io.rs (146 LOC, new) + invoke_error.rs (54 LOC, new).

Test added: invoke_kills_runaway_atom, which stages a kei-flood script running cat /dev/zero and verifies (a) non-zero exit, (b) stdout < 18 MiB, (c) "cap"/"subprocess" in stderr.

cargo check --workspace: clean.
cargo test -p kei-cortex -p kei-runtime -- --test-threads=1: 471 pass / 0 fail.

The pre-existing openai_loop_wiring.rs parallel-run flake (state collision when test-threads>1) is unrelated and unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
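To illustrate why evicting a registry slot is safe while a caller still holds its Arc clone, here is a minimal std-only sketch. It is not the code in state.rs: the commit uses tokio::sync::Mutex around an LruCache, whereas this sketch swaps in a HashMap plus a VecDeque and std::sync::Mutex so it runs without external crates. LockRegistry, its user_lock method, and len are hypothetical names chosen for the example.

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::{Arc, Mutex};

/// Same value as PER_USER_LOCK_CAP in the commit message.
const PER_USER_LOCK_CAP: usize = 1024;

/// Hypothetical std-only stand-in for the capped lock registry.
struct LockRegistry {
    cap: usize,
    slots: HashMap<String, Arc<Mutex<()>>>,
    /// Front = least recently used, back = most recently used.
    order: VecDeque<String>,
}

impl LockRegistry {
    fn new(cap: usize) -> Self {
        Self { cap, slots: HashMap::new(), order: VecDeque::new() }
    }

    /// Returns the lock for `user_id`, creating it on a miss.
    /// Takes `&mut self` because even a hit updates the recency order.
    fn user_lock(&mut self, user_id: &str) -> Arc<Mutex<()>> {
        if let Some(lock) = self.slots.get(user_id) {
            let lock = Arc::clone(lock);
            // Linear scan keeps the sketch short; a real LRU does this in O(1).
            if let Some(pos) = self.order.iter().position(|k| k == user_id) {
                self.order.remove(pos);
            }
            self.order.push_back(user_id.to_owned());
            return lock;
        }
        // Miss: make room by dropping the least recently used slot.
        if self.slots.len() >= self.cap {
            if let Some(oldest) = self.order.pop_front() {
                self.slots.remove(&oldest);
            }
        }
        let lock = Arc::new(Mutex::new(()));
        self.slots.insert(user_id.to_owned(), Arc::clone(&lock));
        self.order.push_back(user_id.to_owned());
        lock
    }

    fn len(&self) -> usize {
        self.slots.len()
    }
}

fn main() {
    let mut registry = LockRegistry::new(PER_USER_LOCK_CAP);
    // A caller takes its own Arc clone before entering its critical section.
    let held = registry.user_lock("user-0");
    // 2048 further distinct users push "user-0" out of the registry.
    for i in 1..=2048 {
        registry.user_lock(&format!("user-{i}"));
    }
    assert!(registry.len() <= PER_USER_LOCK_CAP);
    // The evicted lock is still alive and usable through the caller's clone.
    let _guard = held.lock().unwrap();
    println!("registry size after flood: {}", registry.len());
}
```

One consequence visible in the sketch: after eviction, a later request for the same user_id gets a fresh lock, so mutual exclusion per user holds only among callers that obtained their clone from the same registry slot.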
54 lines
2.2 KiB
Rust
//! Typed errors for the atom-invocation runtime.
//!
//! Constructor Pattern: extracted from `invoke.rs` so the runtime
//! parent file stays under 200 LOC after Wave 44d added bounded-read
//! capture + truncation handling.

#[derive(Debug)]
pub enum InvokeError {
    AtomNotFound(String),
    InputParse(String),
    InputInvalid(String),
    MissingInputSchema(String),
    /// `crate_name` in atom YAML failed the `kei-*` allowlist check.
    InvalidAtom(String),
    /// Crate binary is missing from both `KEI_RUNTIME_BIN_DIR` and `PATH`.
    BinaryNotFound { crate_name: String },
    /// Subprocess exited non-zero; propagate the atom's own exit code.
    AtomFailed { atom: String, code: i32, stderr: String },
    /// IO / spawn failure (not a non-zero exit from the child).
    SubprocessError(String),
    /// Atom's stdout was not parseable as JSON.
    OutputParse(String),
    /// Legacy escape: atom not yet migrated to `run-atom` protocol.
    NotImplemented { atom: String },
}

impl std::fmt::Display for InvokeError {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            Self::AtomNotFound(id) => write!(f, "no atom matching {id}"),
            Self::InputParse(e) => write!(f, "input rejected: {e}"),
            Self::InputInvalid(e) => write!(f, "input rejected: {e}"),
            Self::MissingInputSchema(id) => {
                write!(f, "atom `{id}` declares no input schema")
            }
            Self::InvalidAtom(msg) => write!(f, "invalid atom crate_name: {msg}"),
            Self::BinaryNotFound { crate_name } => write!(
                f,
                "binary `{crate_name}` not found on PATH or KEI_RUNTIME_BIN_DIR"
            ),
            Self::AtomFailed { atom, code, stderr } => {
                write!(f, "atom `{atom}` exited {code}: {stderr}")
            }
            Self::SubprocessError(e) => write!(f, "subprocess: {e}"),
            Self::OutputParse(e) => write!(f, "atom stdout not JSON: {e}"),
            Self::NotImplemented { atom } => write!(
                f,
                "invoke not yet wired for this atom ({atom}); use the underlying CLI directly"
            ),
        }
    }
}

impl std::error::Error for InvokeError {}
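
The commit message says a stream that exceeds its cap surfaces as InvokeError::SubprocessError. The sketch below shows one way that path could look. It is an approximation, not the contents of invoke_io.rs. It assumes a Unix-like host where `cat` and `/dev/zero` exist. read_capped, spawn_reader, run_capped, and the exact error wording are hypothetical, and InvokeError is trimmed to the single variant used here.

```rust
use std::io::{ErrorKind, Read};
use std::process::{Child, Command, Stdio};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// Trimmed stand-in for the real enum; only the variant this sketch needs.
#[derive(Debug)]
enum InvokeError {
    SubprocessError(String),
}

/// Shared handle that lets either reader thread kill the child.
type KillHandle = Arc<Mutex<Option<Child>>>;

/// Reads `pipe` to EOF but keeps at most `cap` bytes. On overflow it kills
/// the child right away, then keeps draining so the pipe can reach EOF.
fn read_capped<R: Read>(mut pipe: R, cap: usize, handle: KillHandle) -> (Vec<u8>, bool) {
    let mut kept = Vec::new();
    let mut chunk = [0u8; 8192];
    let mut truncated = false;
    loop {
        let n = match pipe.read(&mut chunk) {
            Ok(0) => break, // EOF: the child closed (or lost) its end
            Ok(n) => n,
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(_) => break,
        };
        if truncated {
            continue; // already over the cap: discard while draining
        }
        let room = cap - kept.len();
        if n > room {
            kept.extend_from_slice(&chunk[..room]);
            truncated = true;
            // Kill from inside the reader; waiting for the child to exit
            // on its own would never end for an unbounded writer.
            if let Some(child) = handle.lock().unwrap().as_mut() {
                let _ = child.kill();
            }
        } else {
            kept.extend_from_slice(&chunk[..n]);
        }
    }
    (kept, truncated)
}

fn spawn_reader<R: Read + Send + 'static>(
    pipe: R,
    cap: usize,
    handle: &KillHandle,
) -> JoinHandle<(Vec<u8>, bool)> {
    let handle = Arc::clone(handle);
    thread::spawn(move || read_capped(pipe, cap, handle))
}

/// Runs `cmd` and returns its stdout, or an error if either stream overflowed.
fn run_capped(mut cmd: Command, cap: usize) -> Result<Vec<u8>, InvokeError> {
    let mut child = cmd
        .stdin(Stdio::null())
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .spawn()
        .map_err(|e| InvokeError::SubprocessError(e.to_string()))?;
    let stdout = child.stdout.take().expect("stdout was piped");
    let stderr = child.stderr.take().expect("stderr was piped");
    let handle: KillHandle = Arc::new(Mutex::new(Some(child)));

    let out_reader = spawn_reader(stdout, cap, &handle);
    let err_reader = spawn_reader(stderr, cap, &handle);
    let panicked = |_| InvokeError::SubprocessError("reader thread panicked".into());
    let (out, out_truncated) = out_reader.join().map_err(panicked)?;
    let (_err, err_truncated) = err_reader.join().map_err(panicked)?;

    // Both pipes hit EOF, so reaping the child cannot block on output.
    if let Some(mut child) = handle.lock().unwrap().take() {
        let _ = child.wait();
    }
    if out_truncated || err_truncated {
        return Err(InvokeError::SubprocessError(format!(
            "output exceeded {cap} byte cap"
        )));
    }
    Ok(out)
}

fn main() {
    let mut flood = Command::new("cat");
    flood.arg("/dev/zero");
    // A small cap keeps the demo fast; the commit uses STREAM_CAP = 16 MiB.
    match run_capped(flood, 1 << 20) {
        Err(InvokeError::SubprocessError(msg)) => println!("subprocess: {msg}"),
        other => println!("unexpected result: {other:?}"),
    }
}
```

The sketch kills only the direct child. If an atom spawns grandchildren that inherit the pipe, the drain loop would wait on them too, which a real implementation might handle with process groups.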