Closes the loop on "without full tracking the system can't make
decisions" (user pushback on partial coverage). Three gaps that left
the inference layer blind are now wired:

GAP #1: agent toolStats / token counts / cache hits captured
================================================================
`agent-outcome-backfill.sh` now appends one JSONL row per spawn to
`~/.claude/memory/time-metrics/agent-toolstats.jsonl` with:
  agent_id, outcome, stubs, ts, tool_use_count, duration_ms,
  tool_stats {Read:N, Bash:M, ...}, tokens_in, tokens_out,
  cache_read, cache_write

Sidecar journal (no schema migration). Production payload's
.tool_response.totalToolUseCount / totalDurationMs / toolStats /
usage fields land directly. Smoke-tested with synthetic spawn: row
written.

GAP #2: skill_invocations table actually receives writes
================================================================
The `skill_invocations` table (schema v8) had 0 rows because no caller
existed for `skill_metrics::record_invocation`. Added two pieces:

  (a) `kei-ledger record-skill <name> --success {0|1}` CLI subcommand.
      Mirrors record-cost; same dispatch shape. Optional `--agent-id`,
      `--trajectory-id`, `--duration-ms`, `--db`. Validates non-empty
      name + duration ≥ 0. Outputs `{"ok":true,"skill":"...","ts":N}`.

  (b) `hooks/skill-record.sh`: PostToolUse:Skill hook. 50 LOC POSIX.
      Detects Skill tool calls, derives success heuristic from
      tool_response (exit_code / status / content non-empty), shells
      out to `kei-ledger record-skill`. Bypass via
      SKILL_RECORD_BYPASS=1.

83 kei-ledger tests pass (16 unit + 67 integration). Smoke-tested
end-to-end: `kei-ledger record-skill test-skill --success 1` inserts a
row with correct fields.

Phase D nightly skill-metrics decisions (archive if unused N days,
re-extract if success<60% over M days, validated if >20 calls + >90%
success) now have data to consume.

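Illustration only: a minimal sketch of how the Phase D thresholds
quoted above might be applied once rows exist. `SkillStats`,
`Decision`, `decide`, and the `archive_after_days` parameter (standing
in for the unspecified N) are hypothetical names, not part of
kei-ledger; the sketch also assumes `calls` and `successes` were
already counted over the M-day window.

```rust
/// Hypothetical per-skill aggregate; not the real kei-ledger type.
struct SkillStats {
    calls: u32,               // invocations inside the M-day window
    successes: u32,           // invocations recorded with success = 1
    days_since_last_use: u32, // age of the most recent invocation
}

#[derive(Debug, PartialEq)]
enum Decision {
    Archive,
    ReExtract,
    Validated,
    Keep,
}

/// Apply the three thresholds from the commit message.
/// `archive_after_days` is an assumed stand-in for N.
fn decide(s: &SkillStats, archive_after_days: u32) -> Decision {
    if s.days_since_last_use >= archive_after_days {
        return Decision::Archive;
    }
    if s.calls == 0 {
        return Decision::Keep; // no data in the window, nothing to judge
    }
    let rate = f64::from(s.successes) / f64::from(s.calls);
    if rate < 0.60 {
        Decision::ReExtract
    } else if s.calls > 20 && rate > 0.90 {
        Decision::Validated
    } else {
        Decision::Keep
    }
}

fn main() {
    let s = SkillStats { calls: 25, successes: 24, days_since_last_use: 1 };
    println!("{:?}", decide(&s, 30));
}
```

The archive check runs first so that a stale skill is archived
regardless of its historical success rate; that ordering is a design
assumption of the sketch, not something this commit specifies.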
GAP #3: numeric-claims.jsonl receives every evidence-tagged claim
================================================================
RULE 0.18 mandated three markers `[REAL:]` / `[FROM-JOURNAL:]` /
`[ESTIMATE-HTC:]` on every numeric/duration/cost claim, but no hook
appended valid claims to the journal, so the calibration data RULE 0.18
promised never accumulated.

`hooks/numeric-claims-record.sh`: Stop hook, 140 LOC POSIX. Reads
transcript_path from stdin, locates the last assistant message via
recursive flatten (same pattern as agent-outcome-backfill.sh after the
production-payload-shape fix), regex-extracts every
`<phrase> [<TIER>: <pointer>]` triple, appends one JSONL row per claim.
Idempotent within 1-second window to avoid double-recording on repeat
Stop fires. Bypass via NUMERIC_CLAIMS_RECORD_BYPASS=1.

Smoke test: synthetic transcript with 3 markers (REAL + ESTIMATE-HTC +
FROM-JOURNAL) produced exactly 3 well-formed JSONL rows.

Settings.json
================================================================
- PostToolUse:Skill matcher created (or augmented if already present)
  with skill-record.sh.
- Stop:* matcher gains numeric-claims-record.sh after the existing
  chain (stop-verify, task-timer, session-end-dump,
  extract-task-durations, chat-numeric-postflag,
  affect-threshold-check, enrich-from-jsonl).

What this does NOT do (deferred):
- Backfill `skill_invocations` from past traces (history started
  today; Phase D cohort builds forward from now).
- Migrate the agent toolStats sidecar JSONL into a proper ledger
  column. Append-only file is fine for the current scale.
- Refactor main.rs (now 233 LOC, was 212; pre-existing CP debt flagged
  by skill-record agent, left for a separate cleanup PR).

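Illustration only: the `<phrase> [<TIER>: <pointer>]` extraction
described under GAP #3, sketched in Rust with plain string scanning
rather than the regex the POSIX hook uses. `Claim` and
`extract_claims` are hypothetical names, and treating everything since
the previous marker as the phrase is an assumption; the hook's actual
phrase boundary is not stated in this commit. JSONL output and the
1-second idempotency window are left out.

```rust
/// Tiers named in the commit message for RULE 0.18.
const TIERS: [&str; 3] = ["REAL", "FROM-JOURNAL", "ESTIMATE-HTC"];

/// Hypothetical record for one evidence-tagged claim.
#[derive(Debug, PartialEq)]
struct Claim {
    phrase: String,
    tier: String,
    pointer: String,
}

/// Scan `text` for `<phrase> [<TIER>: <pointer>]` markers.
/// Assumption: the phrase is the text between the previous marker
/// (or the start of the message) and the opening bracket.
fn extract_claims(text: &str) -> Vec<Claim> {
    let mut claims = Vec::new();
    let mut cursor = 0; // byte offset where the next phrase starts
    let mut search = 0; // byte offset where the next '[' search starts
    while let Some(rel) = text[search..].find('[') {
        let open = search + rel;
        let Some(close_rel) = text[open..].find(']') else { break };
        let close = open + close_rel;
        let inner = &text[open + 1..close];
        // Accept the bracket only if it reads "<known tier>: <pointer>".
        let parsed = inner.split_once(':').and_then(|(tier, ptr)| {
            TIERS.contains(&tier.trim()).then(|| (tier.trim(), ptr.trim()))
        });
        if let Some((tier, pointer)) = parsed {
            let phrase = text[cursor..open]
                .trim_matches(|c: char| c.is_whitespace() || matches!(c, '.' | ',' | ';'));
            claims.push(Claim {
                phrase: phrase.to_string(),
                tier: tier.to_string(),
                pointer: pointer.to_string(),
            });
            cursor = close + 1;
            search = close + 1;
        } else {
            search = open + 1; // not a marker; keep scanning
        }
    }
    claims
}

fn main() {
    let msg = "Build took 42 s [REAL: ci.log]. Cost is about 3 cents [ESTIMATE-HTC: 1x]";
    for c in extract_claims(msg) {
        println!("{} | {} | {}", c.tier, c.pointer, c.phrase);
    }
}
```

Brackets that do not start with one of the three tiers are skipped, so
ordinary bracketed text in an assistant message produces no row.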
=== STATUS-TRUTH MARKER ===
shipped: functional
stubs: 0
cargo-check: PASS
behaviour-verified: yes
follow-up-required:
  - kei-ledger main.rs Constructor Pattern split (212→233 LOC)
  - Verify in next session: skill_invocations gets rows from real
    Skill tool use; numeric-claims.jsonl gets rows from real
    assistant messages with markers

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

186 lines
6.1 KiB
Rust
//! CLI dispatch helpers: one function per subcommand.
//!
//! Constructor Pattern: extracted from `main.rs` to keep the entry point
//! under the 200-LOC cap. Each fn returns `ExitCode` directly so `main`
//! stays a flat match.
//!
//! Module owner: the binary crate. Pulls library functions from the
//! `kei_ledger` crate (defined in `src/lib.rs`).

use kei_ledger::{cost, descendants, ledger, skill_aggregator_cli, skill_metrics};
use rusqlite::Connection;
use serde_json::json;
use std::path::Path;
use std::process::ExitCode;

/// Print `msg` to stderr with the `kei-ledger:` prefix and return exit code 1.
pub fn err(msg: &str) -> ExitCode {
    eprintln!("kei-ledger: {msg}");
    ExitCode::from(1)
}

/// List agents, optionally filtered by `status`, one tab-separated row each.
/// The spec SHA is truncated to at most 12 characters for display.
pub fn cmd_list(conn: &Connection, status: Option<&str>) -> ExitCode {
    match ledger::list(conn, status) {
        Ok(rows) => {
            if rows.is_empty() {
                println!("(no agents)");
            }
            for r in &rows {
                println!(
                    "{}\t{}\t{}\t{}\tparent={}\tspec={}",
                    r.id,
                    r.status,
                    r.branch,
                    r.started_ts,
                    r.parent_branch.as_deref().unwrap_or("-"),
                    &r.spec_sha[..r.spec_sha.len().min(12)]
                );
            }
            ExitCode::SUCCESS
        }
        Err(e) => err(&format!("list failed: {e}")),
    }
}

/// Print the rows `ledger::tree` returns for `id`. The row whose id equals
/// `id` is printed flush left; every other row is indented.
pub fn cmd_tree(conn: &Connection, id: &str) -> ExitCode {
    match ledger::tree(conn, id) {
        Ok(rows) if rows.is_empty() => err(&format!("no agent with id {id}")),
        Ok(rows) => {
            for r in &rows {
                let indent = if r.id == id { "" } else { " " };
                println!("{}{} [{}] branch={}", indent, r.id, r.status, r.branch);
            }
            ExitCode::SUCCESS
        }
        Err(e) => err(&format!("tree failed: {e}")),
    }
}

/// Check the ledger artefacts for an agent under `repo_root`. Exits 0 when
/// nothing is missing; otherwise lists the missing artefacts on stderr and
/// exits 2.
pub fn cmd_validate(branch: &str, repo_root: &Path) -> ExitCode {
    // branch naming convention: agent/<kind>-<ts> OR inline-<ts>
    // ledger artefact dir uses the raw agent id, which the caller passes as branch.
    let agent_id = branch.strip_prefix("agent/").unwrap_or(branch);
    let missing = ledger::validate(repo_root, agent_id);
    if missing.is_empty() {
        println!("OK: all 6 artefacts present for {agent_id}");
        ExitCode::SUCCESS
    } else {
        eprintln!("MISSING for {agent_id}:");
        for m in &missing {
            eprintln!(" - {m}");
        }
        ExitCode::from(2)
    }
}

/// List descendants of `dna`. A row is labelled "fork" when its
/// `fork_parent_id` equals `dna`, and "spawn" otherwise.
pub fn cmd_descendants(conn: &Connection, dna: &str) -> ExitCode {
    match descendants::descendants(conn, dna) {
        Ok(rows) => {
            if rows.is_empty() {
                println!("(no descendants for {dna})");
            }
            for r in &rows {
                let relation = if r.fork_parent_id.as_deref() == Some(dna) {
                    "fork"
                } else {
                    "spawn"
                };
                println!("{}\t{}\t{}\t{}", r.id, relation, r.status, r.branch);
            }
            ExitCode::SUCCESS
        }
        Err(e) => err(&format!("descendants failed: {e}")),
    }
}

/// Record cost metadata for an existing agent. Emits JSON to stdout so
/// callers (cortex, scripts) can pipe through `jq`. Exit code 1 if the
/// agent does not exist (zero rows updated) or if the write or read-back
/// fails, 0 otherwise. Schema must be at v6+; `kei-ledger init` migrates
/// legacy ledgers automatically on open before this dispatcher runs.
///
/// Wave 44c: ADDITIVE semantics: repeated calls accumulate cost_cents
/// for the same agent. Use `--replace` for the legacy overwrite
/// behavior (typically only retry / amend flows).
pub fn cmd_record_cost(
    conn: &Connection,
    agent_id: &str,
    cents: u64,
    provider: &str,
    model: &str,
    replace: bool,
) -> ExitCode {
    let result = if replace {
        cost::replace_cost(conn, agent_id, cents, provider, model)
    } else {
        cost::record_cost(conn, agent_id, cents, provider, model)
    };
    match result {
        Ok(0) => err(&format!("no agent with id {agent_id}")),
        Ok(_) => emit_record_cost_json(conn, agent_id),
        Err(e) => err(&format!("record-cost failed: {e}")),
    }
}

/// Record a skill invocation row in `skill_invocations` (schema v8+).
/// Validates: skill_name non-empty, duration_ms ≥ 0 if provided.
/// Emits a one-line JSON `{"ok":true,"skill":"<name>","ts":<unix>}` on success.
pub fn cmd_record_skill(
    conn: &Connection,
    skill_name: &str,
    success: u8,
    agent_id: Option<String>,
    trajectory_id: Option<String>,
    duration_ms: Option<i64>,
) -> ExitCode {
    if skill_name.is_empty() {
        return err("skill_name must not be empty");
    }
    if let Some(ms) = duration_ms {
        if ms < 0 {
            return err("duration_ms must be >= 0");
        }
    }
    let ts = chrono::Utc::now().timestamp();
    let inv = skill_metrics::SkillInvocation {
        skill_name: skill_name.to_string(),
        ts,
        agent_id,
        success: success != 0,
        trajectory_id,
        duration_ms,
    };
    match skill_metrics::record_invocation(conn, &inv) {
        Ok(_) => {
            println!("{}", json!({"ok": true, "skill": skill_name, "ts": ts}));
            ExitCode::SUCCESS
        }
        Err(e) => err(&format!("record-skill failed: {e}")),
    }
}

/// Thin pass-through so `main.rs` keeps all cmd_* in one import namespace.
pub fn cmd_aggregate_skills(
    conn: &Connection,
    since: Option<i64>,
    format: &str,
) -> ExitCode {
    skill_aggregator_cli::cmd_aggregate_skills(conn, since, format)
}

/// Emit the post-write JSON line. Split out to keep `cmd_record_cost`
/// flat and ≤30 LOC after the `--replace` branch was added.
fn emit_record_cost_json(conn: &Connection, agent_id: &str) -> ExitCode {
    match cost::read_cost(conn, agent_id) {
        Ok(Some((total, _, _))) => {
            let body = json!({
                "ok": true,
                "agent_id": agent_id,
                "total_cost_cents": total,
            });
            println!("{body}");
            ExitCode::SUCCESS
        }
        Ok(None) => err(&format!("agent {agent_id} disappeared mid-write")),
        Err(e) => err(&format!("read-back failed: {e}")),
    }
}