Closes the loop on "without full tracking the system can't make
decisions" (user pushback on partial coverage). Three gaps that left
the inference layer blind are now wired:

GAP #1: agent toolStats / token counts / cache hits captured
================================================================
`agent-outcome-backfill.sh` now appends one JSONL row per spawn to
`~/.claude/memory/time-metrics/agent-toolstats.jsonl` with:
  agent_id, outcome, stubs, ts, tool_use_count, duration_ms,
  tool_stats {Read:N, Bash:M, ...}, tokens_in, tokens_out,
  cache_read, cache_write
Sidecar journal (no schema migration). Production payload's
.tool_response.totalToolUseCount / totalDurationMs / toolStats /
usage fields land directly. Smoke-tested with synthetic spawn: row
written.

GAP #2: skill_invocations table actually receives writes
================================================================
The `skill_invocations` table (schema v8) had 0 rows because no
caller existed for `skill_metrics::record_invocation`. Added two
pieces:

(a) `kei-ledger record-skill <name> --success {0|1}` CLI subcommand.
    Mirrors record-cost; same dispatch shape. Optional `--agent-id`,
    `--trajectory-id`, `--duration-ms`, `--db`. Validates non-empty
    name + duration ≥ 0. Outputs `{"ok":true,"skill":"...","ts":N}`.

(b) `hooks/skill-record.sh`: PostToolUse:Skill hook. 50 LOC POSIX.
    Detects Skill tool calls, derives success heuristic from
    tool_response (exit_code / status / content non-empty), shells
    out to `kei-ledger record-skill`. Bypass via
    SKILL_RECORD_BYPASS=1.

83 kei-ledger tests pass (16 unit + 67 integration). Smoke-tested
end-to-end: `kei-ledger record-skill test-skill --success 1` inserts
a row with correct fields.

Phase D nightly skill-metrics decisions (archive if unused N days,
re-extract if success<60% over M days, validated if >20 calls and
success >90%) now have data to consume.
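
The Phase D thresholds above can be read as a small decision function. What follows is a minimal sketch, not the shipped nightly job: `SkillStats`, `Decision`, `decide`, and the `archive_after_days` parameter (standing in for the unspecified N) are hypothetical names, the order in which the rules are checked is an assumption, and the stats are assumed to be already aggregated over the M-day window.

```rust
/// Per-skill aggregate over the look-back window (hypothetical shape).
#[derive(Debug, Clone, Copy)]
pub struct SkillStats {
    /// Invocations recorded in the window.
    pub calls: u32,
    /// Invocations recorded with success = 1.
    pub successes: u32,
    /// Whole days since the most recent invocation.
    pub days_since_last_use: u32,
}

/// Outcomes named in the commit message, plus `Keep` for "no rule fired".
#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    Archive,
    ReExtract,
    Validated,
    Keep,
}

/// Applies the three thresholds. Rule order is an assumption:
/// staleness first, then low success rate, then validation.
pub fn decide(stats: SkillStats, archive_after_days: u32) -> Decision {
    if stats.days_since_last_use >= archive_after_days {
        return Decision::Archive;
    }
    // Compare in integers to avoid float rounding:
    // rate < 60%  <=>  successes * 100 < calls * 60.
    let scaled_successes = u64::from(stats.successes) * 100;
    let calls = u64::from(stats.calls);
    if scaled_successes < calls * 60 {
        Decision::ReExtract
    } else if stats.calls > 20 && scaled_successes > calls * 90 {
        Decision::Validated
    } else {
        Decision::Keep
    }
}
```

With `archive_after_days = 30`, a skill with 30 calls and 29 successes used yesterday comes out as `Validated`, while one with 10 calls and 5 successes comes out as `ReExtract`. A skill with zero calls that is not yet stale falls through to `Keep`, because neither rate comparison holds.
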

GAP #3: numeric-claims.jsonl receives every evidence-tagged claim
================================================================
RULE 0.18 mandated three markers `[REAL:]` / `[FROM-JOURNAL:]` /
`[ESTIMATE-HTC:]` on every numeric/duration/cost claim, but no hook
appended valid claims to the journal, so the calibration data
RULE 0.18 promised never accumulated.

`hooks/numeric-claims-record.sh`: Stop hook, 140 LOC POSIX. Reads
transcript_path from stdin, locates the last assistant message via
recursive flatten (same pattern as agent-outcome-backfill.sh after
the production-payload-shape fix), regex-extracts every
`<phrase> [<TIER>: <pointer>]` triple, appends one JSONL row per
claim. Idempotent within 1-second window to avoid double-recording
on repeat Stop fires. Bypass via NUMERIC_CLAIMS_RECORD_BYPASS=1.

Smoke test: synthetic transcript with 3 markers (REAL +
ESTIMATE-HTC + FROM-JOURNAL) produced exactly 3 well-formed JSONL
rows.

Settings.json
================================================================
- PostToolUse:Skill matcher created (or augmented if already
  present) with skill-record.sh.
- Stop:* matcher gains numeric-claims-record.sh after the existing
  chain (stop-verify, task-timer, session-end-dump,
  extract-task-durations, chat-numeric-postflag,
  affect-threshold-check, enrich-from-jsonl).

What this does NOT do (deferred):
- Backfill `skill_invocations` from past traces (history started
  today; Phase D cohort builds forward from now).
- Migrate the agent toolStats sidecar JSONL into a proper ledger
  column. Append-only file is fine for the current scale.
- Refactor main.rs (now 233 LOC, was 212; pre-existing CP debt
  flagged by skill-record agent; separate cleanup PR).
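
The GAP #3 extraction of `<phrase> [<TIER>: <pointer>]` triples can be illustrated without the shell hook. This is a sketch under stated assumptions, not a port of `numeric-claims-record.sh`: it uses a hand-written scanner instead of a regex, `Claim` and `extract_claims` are hypothetical names, and the phrase is assumed to be the text on the same line between the previous bracket group (or line start) and the marker.

```rust
/// The three evidence tiers named by RULE 0.18.
static TIERS: [&str; 3] = ["REAL", "FROM-JOURNAL", "ESTIMATE-HTC"];

/// One extracted claim (hypothetical row shape).
#[derive(Debug, PartialEq, Eq)]
pub struct Claim {
    pub phrase: String,
    pub tier: &'static str,
    pub pointer: String,
}

/// Scans each line for `[TIER: pointer]` groups and pairs each with the
/// text that precedes it on that line. Brackets whose tier is not one of
/// the three known markers are skipped.
pub fn extract_claims(message: &str) -> Vec<Claim> {
    let mut claims = Vec::new();
    for line in message.lines() {
        let mut rest = line;
        // Each iteration consumes one "[...]" group from the front of `rest`.
        while let Some(open) = rest.find('[') {
            let after_open = &rest[open + 1..];
            let Some(close) = after_open.find(']') else { break };
            let inner = &after_open[..close];
            if let Some((raw_tier, pointer)) = inner.split_once(':') {
                if let Some(&tier) = TIERS.iter().find(|t| **t == raw_tier.trim()) {
                    claims.push(Claim {
                        phrase: rest[..open].trim().to_string(),
                        tier,
                        pointer: pointer.trim().to_string(),
                    });
                }
            }
            rest = &after_open[close + 1..];
        }
    }
    claims
}
```

Each returned `Claim` would map to one JSONL row; the 1-second idempotency window and the journal write are left out of the sketch.
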

=== STATUS-TRUTH MARKER ===
shipped: functional
stubs: 0
cargo-check: PASS
behaviour-verified: yes
follow-up-required:
  - kei-ledger main.rs Constructor Pattern split (212→233 LOC)
  - Verify in next session: skill_invocations gets rows from real
    Skill tool use; numeric-claims.jsonl gets rows from real
    assistant messages with markers

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
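
The `record-skill` input checks from GAP #2 (non-empty name, duration ≥ 0) and its JSON acknowledgement can be sketched as two small functions. This is only an illustration: `validate_record_skill` and `ack_line` are hypothetical helpers, the error strings are invented, and the real logic lives in the `dispatch` module, which is not shown here.

```rust
/// Checks described for `record-skill`: non-empty name, duration >= 0.
/// Hypothetical helper; not the kei-ledger function.
pub fn validate_record_skill(name: &str, duration_ms: Option<i64>) -> Result<(), String> {
    if name.is_empty() {
        return Err("skill name must not be empty".to_string());
    }
    if let Some(ms) = duration_ms {
        if ms < 0 {
            return Err(format!("duration_ms must be >= 0, got {ms}"));
        }
    }
    Ok(())
}

/// Builds the acknowledgement line in the shape quoted in the commit
/// message. Escaping covers only `"` and `\`, which is an assumption made
/// to keep the sketch dependency-free; a JSON library would be the more
/// robust choice.
pub fn ack_line(skill: &str, ts: i64) -> String {
    let escaped = skill.replace('\\', "\\\\").replace('"', "\\\"");
    format!("{{\"ok\":true,\"skill\":\"{escaped}\",\"ts\":{ts}}}")
}
```

Keeping the checks in a plain function of `(&str, Option<i64>)` means they can be unit-tested without opening a SQLite connection.
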
233 lines
7.9 KiB
Rust
//! kei-ledger: CLI dispatcher.
//!
//! Single responsibility: parse args, dispatch to ledger ops, format output.
//! Storage: `~/.claude/agents/ledger.sqlite` (or $KEI_LEDGER_DB override).
//!
//! Module tree: this binary depends on the `kei_ledger` library crate
//! (defined in `src/lib.rs`). The CLI dispatcher holds clap shapes and
//! glue only; every operation forwards to a library function.

mod dispatch;

use clap::{Parser, Subcommand};
use dispatch::{
    cmd_aggregate_skills, cmd_descendants, cmd_list, cmd_record_cost, cmd_record_skill, cmd_tree,
    cmd_validate, err,
};
use kei_ledger::{ledger, schema};
use std::path::PathBuf;
use std::process::ExitCode;

#[derive(Parser)]
#[command(name = "kei-ledger", version, about = "Agent fork/done/fail ledger")]
struct Cli {
    /// Override ledger path (default: $KEI_LEDGER_DB or ~/.claude/agents/ledger.sqlite)
    #[arg(long)]
    db: Option<PathBuf>,
    #[command(subcommand)]
    cmd: Cmd,
}

#[derive(Subcommand)]
enum Cmd {
    /// Create the ledger file + schema if missing.
    Init,
    /// Log a new running agent.
    Fork {
        id: String,
        /// Branch name (<=256 chars).
        #[arg(value_parser = parse_branch)]
        branch: String,
        /// Parent branch (<=256 chars).
        #[arg(long, value_parser = parse_branch)]
        parent: Option<String>,
        #[arg(long)]
        spec_sha: String,
        #[arg(long)]
        worktree: Option<String>,
        /// Layer G DNA fingerprint (optional; kept blank for legacy callers).
        #[arg(long)]
        dna: Option<String>,
        /// DNA / human id of the agent that spawned this fork (v4 lineage).
        #[arg(long)]
        creator: Option<String>,
        /// DNA of the forked-from agent, if this is itself a fork (v4 lineage).
        #[arg(long = "fork-parent")]
        fork_parent: Option<String>,
    },
    /// Mark a running agent as done.
    Done {
        id: String,
        #[arg(long)]
        summary: String,
    },
    /// Mark a running agent as failed.
    Fail {
        id: String,
        #[arg(long)]
        reason: String,
    },
    /// Mark a done/failed agent as merged.
    Merged { id: String },
    /// List agents, optionally filtered by status.
    List {
        #[arg(long)]
        status: Option<String>,
    },
    /// Print parent -> children tree starting at a root agent id.
    Tree { id: String },
    /// Validate required artefact bundle for a given branch's agent.
    Validate {
        branch: String,
        #[arg(long, default_value = ".")]
        repo_root: PathBuf,
    },
    /// List agents whose fork_parent_id OR creator_id equals the given DNA.
    Descendants { dna: String },
    /// Aggregate skill_invocations for Phase D nightly decisions.
    AggregateSkills {
        /// Unix-second lower bound (default: now - 30 days).
        #[arg(long)]
        since: Option<i64>,
        /// Output format: json or markdown (default: markdown).
        #[arg(long, default_value = "markdown")]
        format: String,
    },
    /// Record a skill invocation in `skill_invocations` (schema v8+).
    RecordSkill {
        /// Skill name as registered in `~/.claude/skills/<name>/SKILL.md`.
        skill_name: String,
        /// 1 = succeeded, 0 = bailed/failed.
        #[arg(long, value_parser = clap::value_parser!(u8).range(0..=1))]
        success: u8,
        /// Optional agent invocation that triggered this skill.
        #[arg(long)]
        agent_id: Option<String>,
        /// Optional trajectory id for skill-chain grouping.
        #[arg(long)]
        trajectory_id: Option<String>,
        /// Wall-clock duration in milliseconds (≥ 0).
        #[arg(long)]
        duration_ms: Option<i64>,
    },
    /// Record cost-tracking metadata (v6+) for an existing agent row.
    /// Wave 44c: ADDITIVE by default, so repeated calls accumulate. Pass
    /// `--replace` for legacy last-write-wins overwrite behavior.
    RecordCost {
        /// Agent id (matches `fork ... <id>`).
        agent_id: String,
        /// Cost in cents (integer ≥ 0). Capped at i64::MAX on extreme values.
        #[arg(long)]
        cents: u64,
        /// Provider name, e.g. "anthropic".
        #[arg(long)]
        provider: String,
        /// Model name, e.g. "claude-haiku-4-5-20251001".
        #[arg(long)]
        model: String,
        /// Overwrite previous cost (legacy semantics). Without this flag,
        /// the call accumulates with any prior recorded cost on the row.
        #[arg(long, default_value_t = false)]
        replace: bool,
    },
}

/// clap value_parser caps branch/parent length at MAX_BRANCH_LEN (audit L1).
fn parse_branch(s: &str) -> Result<String, String> {
    if s.len() > schema::MAX_BRANCH_LEN {
        return Err(format!(
            "branch length {} exceeds cap {}",
            s.len(),
            schema::MAX_BRANCH_LEN
        ));
    }
    Ok(s.to_string())
}

fn db_path(cli_db: Option<PathBuf>) -> PathBuf {
    if let Some(p) = cli_db {
        return p;
    }
    if let Ok(env) = std::env::var("KEI_LEDGER_DB") {
        return PathBuf::from(env);
    }
    let home = std::env::var("HOME").unwrap_or_else(|_| ".".into());
    PathBuf::from(home).join(".claude/agents/ledger.sqlite")
}

#[allow(clippy::too_many_arguments)]
fn run_fork(
    conn: &rusqlite::Connection,
    id: String,
    branch: String,
    parent: Option<String>,
    spec_sha: String,
    worktree: Option<String>,
    dna: Option<String>,
    creator: Option<String>,
    fork_parent: Option<String>,
) -> ExitCode {
    match ledger::fork(
        conn,
        &id,
        &branch,
        parent.as_deref(),
        &spec_sha,
        worktree.as_deref(),
        dna.as_deref(),
        creator.as_deref(),
        fork_parent.as_deref(),
    ) {
        Ok(()) => {
            println!("forked {id} -> {branch}");
            ExitCode::SUCCESS
        }
        Err(e) => err(&format!("fork failed: {e}")),
    }
}

fn main() -> ExitCode {
    let cli = Cli::parse();
    let path = db_path(cli.db);
    let conn = match ledger::open(&path) {
        Ok(c) => c,
        Err(e) => return err(&format!("open {}: {e}", path.display())),
    };
    match cli.cmd {
        Cmd::Init => {
            println!("initialised {}", path.display());
            ExitCode::SUCCESS
        }
        Cmd::Fork { id, branch, parent, spec_sha, worktree, dna, creator, fork_parent } => {
            run_fork(&conn, id, branch, parent, spec_sha, worktree, dna, creator, fork_parent)
        }
        Cmd::Done { id, summary } => match ledger::done(&conn, &id, &summary) {
            Ok(0) => err(&format!("no running agent with id {id}")),
            Ok(_) => ExitCode::SUCCESS,
            Err(e) => err(&format!("done failed: {e}")),
        },
        Cmd::Fail { id, reason } => match ledger::fail(&conn, &id, &reason) {
            Ok(0) => err(&format!("no running agent with id {id}")),
            Ok(_) => ExitCode::SUCCESS,
            Err(e) => err(&format!("fail update failed: {e}")),
        },
        Cmd::Merged { id } => match ledger::merged(&conn, &id) {
            Ok(0) => err(&format!("no done/failed agent with id {id}")),
            Ok(_) => ExitCode::SUCCESS,
            Err(e) => err(&format!("merged failed: {e}")),
        },
        Cmd::List { status } => cmd_list(&conn, status.as_deref()),
        Cmd::Tree { id } => cmd_tree(&conn, &id),
        Cmd::Validate { branch, repo_root } => cmd_validate(&branch, &repo_root),
        Cmd::Descendants { dna } => cmd_descendants(&conn, &dna),
        Cmd::AggregateSkills { since, format } => {
            cmd_aggregate_skills(&conn, since, &format)
        }
        Cmd::RecordSkill { skill_name, success, agent_id, trajectory_id, duration_ms } => {
            cmd_record_skill(&conn, &skill_name, success, agent_id, trajectory_id, duration_ms)
        }
        Cmd::RecordCost { agent_id, cents, provider, model, replace } => {
            cmd_record_cost(&conn, &agent_id, cents, &provider, &model, replace)
        }
    }
}
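
`db_path` resolves the ledger location in a fixed order: the `--db` flag, then `$KEI_LEDGER_DB`, then `$HOME/.claude/agents/ledger.sqlite`, with `.` standing in for `HOME` when it is unset. The sketch below restates that precedence as a pure function so it can be exercised without mutating the process environment. `resolve_db_path` and its `env_db` / `home` parameters are hypothetical names for illustration and are not part of kei-ledger.

```rust
use std::path::PathBuf;

/// Same precedence as `db_path`, with the environment passed in explicitly.
/// `env_db` stands for the value of $KEI_LEDGER_DB and `home` for $HOME;
/// both parameter names are illustrative only.
pub fn resolve_db_path(
    cli_db: Option<PathBuf>,
    env_db: Option<String>,
    home: Option<String>,
) -> PathBuf {
    // 1. An explicit --db flag wins.
    if let Some(p) = cli_db {
        return p;
    }
    // 2. Otherwise the environment override.
    if let Some(env) = env_db {
        return PathBuf::from(env);
    }
    // 3. Otherwise the default under $HOME, or under "." when HOME is unset.
    let home = home.unwrap_or_else(|| ".".to_string());
    PathBuf::from(home).join(".claude/agents/ledger.sqlite")
}
```

A caller would pass `std::env::var("KEI_LEDGER_DB").ok()` and `std::env::var("HOME").ok()` to get the same result as the original function, while tests can pass literal values.
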