KeiSeiKit-1.0/_primitives/_rust/kei-ledger/src/main.rs
Parfii-bot e073df6c98 feat(tracking): close 3 last observability gaps — toolStats + skill-record + numeric-claims journal
Closes the loop on "without full tracking the system can't make decisions"
(user pushback on partial coverage). Three gaps that left the inference
layer blind are now wired:

GAP #1 — agent toolStats / token counts / cache hits captured
================================================================
`agent-outcome-backfill.sh` now appends one JSONL row per spawn to
`~/.claude/memory/time-metrics/agent-toolstats.jsonl` with:
  agent_id, outcome, stubs, ts,
  tool_use_count, duration_ms, tool_stats {Read:N, Bash:M, ...},
  tokens_in, tokens_out, cache_read, cache_write
Sidecar journal (no schema migration). Production payload's
.tool_response.totalToolUseCount / totalDurationMs / toolStats / usage
fields land directly. Smoke-tested with synthetic spawn — row written.

GAP #2 — skill_invocations table actually receives writes
================================================================
The `skill_invocations` table (schema v8) had 0 rows because no caller
existed for `skill_metrics::record_invocation`. Added two pieces:

(a) `kei-ledger record-skill <name> --success {0|1}` CLI subcommand
    Mirrors record-cost; same dispatch shape. Optional `--agent-id`,
    `--trajectory-id`, `--duration-ms`, `--db`. Validates non-empty
    name + duration ≥ 0. Outputs `{"ok":true,"skill":"...","ts":N}`.

(b) `hooks/skill-record.sh` — PostToolUse:Skill hook. 50 LOC POSIX.
    Detects Skill tool calls, derives success heuristic from
    tool_response (exit_code / status / content non-empty), shells
    out to `kei-ledger record-skill`. Bypass via SKILL_RECORD_BYPASS=1.

83 kei-ledger tests pass (16 unit + 67 integration). Smoke-tested
end-to-end: `kei-ledger record-skill test-skill --success 1` inserts
a row with correct fields.

Phase D nightly skill-metrics decisions (archive if unused N days,
re-extract if success<60% over M days, validated if >20 calls + >90%
success) now have data to consume.

GAP #3 — numeric-claims.jsonl receives every evidence-tagged claim
================================================================
RULE 0.18 mandated three markers `[REAL:]` / `[FROM-JOURNAL:]` /
`[ESTIMATE-HTC:]` on every numeric/duration/cost claim, but no hook
appended valid claims to the journal — the calibration data RULE 0.18
promised never accumulated.

`hooks/numeric-claims-record.sh` — Stop hook, 140 LOC POSIX. Reads
transcript_path from stdin, locates the last assistant message via
recursive flatten (same pattern as agent-outcome-backfill.sh after
the production-payload-shape fix), regex-extracts every `<phrase>
[<TIER>: <pointer>]` triple, appends one JSONL row per claim.

Idempotent within 1-second window to avoid double-recording on
repeat Stop fires. Bypass via NUMERIC_CLAIMS_RECORD_BYPASS=1.

Smoke test: synthetic transcript with 3 markers (REAL + ESTIMATE-HTC
+ FROM-JOURNAL) produced exactly 3 well-formed JSONL rows.

Settings.json
================================================================
- PostToolUse:Skill matcher created (or augmented if already
  present) with skill-record.sh.
- Stop:* matcher gains numeric-claims-record.sh after the existing
  chain (stop-verify, task-timer, session-end-dump, extract-task-
  durations, chat-numeric-postflag, affect-threshold-check,
  enrich-from-jsonl).

What this does NOT do (deferred):
  - Backfill `skill_invocations` from past traces (history started
    today; Phase D cohort builds forward from now).
  - Migrate the agent toolStats sidecar JSONL into a proper ledger
    column. Append-only file is fine for the current scale.
  - Refactor main.rs (now 233 LOC, was 212; pre-existing CP debt
    flagged by skill-record agent — separate cleanup PR).

=== STATUS-TRUTH MARKER ===
shipped: functional
stubs: 0
cargo-check: PASS
behaviour-verified: yes
follow-up-required:
  - kei-ledger main.rs Constructor Pattern split (212→233 LOC)
  - Verify in next session: skill_invocations gets rows from real
    Skill tool use; numeric-claims.jsonl gets rows from real assistant
    messages with markers

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 03:42:09 +08:00

233 lines
7.9 KiB
Rust

//! kei-ledger — CLI dispatcher.
//!
//! Single responsibility: parse args, dispatch to ledger ops, format output.
//! Storage: `~/.claude/agents/ledger.sqlite` (or $KEI_LEDGER_DB override).
//!
//! Module tree: this binary depends on the `kei_ledger` library crate
//! (defined in `src/lib.rs`). The CLI dispatcher holds clap shapes and
//! glue only — every operation forwards to a library function.
mod dispatch;
use clap::{Parser, Subcommand};
use dispatch::{
cmd_aggregate_skills, cmd_descendants, cmd_list, cmd_record_cost, cmd_record_skill, cmd_tree,
cmd_validate, err,
};
use kei_ledger::{ledger, schema};
use std::path::PathBuf;
use std::process::ExitCode;
#[derive(Parser)]
#[command(name = "kei-ledger", version, about = "Agent fork/done/fail ledger")]
struct Cli {
/// Override ledger path (default: $KEI_LEDGER_DB or ~/.claude/agents/ledger.sqlite)
#[arg(long)]
db: Option<PathBuf>,
#[command(subcommand)]
cmd: Cmd,
}
#[derive(Subcommand)]
enum Cmd {
/// Create the ledger file + schema if missing.
Init,
/// Log a new running agent.
Fork {
id: String,
/// Branch name (<=256 chars).
#[arg(value_parser = parse_branch)]
branch: String,
/// Parent branch (<=256 chars).
#[arg(long, value_parser = parse_branch)]
parent: Option<String>,
#[arg(long)]
spec_sha: String,
#[arg(long)]
worktree: Option<String>,
/// Layer G DNA fingerprint (optional; kept blank for legacy callers).
#[arg(long)]
dna: Option<String>,
/// DNA / human id of the agent that spawned this fork (v4 lineage).
#[arg(long)]
creator: Option<String>,
/// DNA of the forked-from agent, if this is itself a fork (v4 lineage).
#[arg(long = "fork-parent")]
fork_parent: Option<String>,
},
/// Mark a running agent as done.
Done {
id: String,
#[arg(long)]
summary: String,
},
/// Mark a running agent as failed.
Fail {
id: String,
#[arg(long)]
reason: String,
},
/// Mark a done/failed agent as merged.
Merged { id: String },
/// List agents, optionally filtered by status.
List {
#[arg(long)]
status: Option<String>,
},
/// Print parent -> children tree starting at a root agent id.
Tree { id: String },
/// Validate required artefact bundle for a given branch's agent.
Validate {
branch: String,
#[arg(long, default_value = ".")]
repo_root: PathBuf,
},
/// List agents whose fork_parent_id OR creator_id equals the given DNA.
Descendants { dna: String },
/// Aggregate skill_invocations for Phase D nightly decisions.
AggregateSkills {
/// Unix-second lower bound (default: now - 30 days).
#[arg(long)]
since: Option<i64>,
/// Output format: json or markdown (default: markdown).
#[arg(long, default_value = "markdown")]
format: String,
},
/// Record a skill invocation in `skill_invocations` (schema v8+).
RecordSkill {
/// Skill name as registered in `~/.claude/skills/<name>/SKILL.md`.
skill_name: String,
/// 1 = succeeded, 0 = bailed/failed.
#[arg(long, value_parser = clap::value_parser!(u8).range(0..=1))]
success: u8,
/// Optional agent invocation that triggered this skill.
#[arg(long)]
agent_id: Option<String>,
/// Optional trajectory id for skill-chain grouping.
#[arg(long)]
trajectory_id: Option<String>,
/// Wall-clock duration in milliseconds (≥ 0).
#[arg(long)]
duration_ms: Option<i64>,
},
/// Record cost-tracking metadata (v6+) for an existing agent row.
/// Wave 44c: ADDITIVE by default — repeated calls accumulate. Pass
/// `--replace` for legacy last-write-wins overwrite behavior.
RecordCost {
/// Agent id (matches `fork ... <id>`).
agent_id: String,
/// Cost in cents (integer ≥ 0). Capped at i64::MAX on extreme values.
#[arg(long)]
cents: u64,
/// Provider name, e.g. "anthropic".
#[arg(long)]
provider: String,
/// Model name, e.g. "claude-haiku-4-5-20251001".
#[arg(long)]
model: String,
/// Overwrite previous cost (legacy semantics). Without this flag,
/// the call accumulates with any prior recorded cost on the row.
#[arg(long, default_value_t = false)]
replace: bool,
},
}
/// clap value_parser caps branch/parent length at MAX_BRANCH_LEN (audit L1).
fn parse_branch(s: &str) -> Result<String, String> {
if s.len() > schema::MAX_BRANCH_LEN {
return Err(format!(
"branch length {} exceeds cap {}",
s.len(),
schema::MAX_BRANCH_LEN
));
}
Ok(s.to_string())
}
fn db_path(cli_db: Option<PathBuf>) -> PathBuf {
if let Some(p) = cli_db {
return p;
}
if let Ok(env) = std::env::var("KEI_LEDGER_DB") {
return PathBuf::from(env);
}
let home = std::env::var("HOME").unwrap_or_else(|_| ".".into());
PathBuf::from(home).join(".claude/agents/ledger.sqlite")
}
#[allow(clippy::too_many_arguments)]
fn run_fork(
conn: &rusqlite::Connection,
id: String,
branch: String,
parent: Option<String>,
spec_sha: String,
worktree: Option<String>,
dna: Option<String>,
creator: Option<String>,
fork_parent: Option<String>,
) -> ExitCode {
match ledger::fork(
conn,
&id,
&branch,
parent.as_deref(),
&spec_sha,
worktree.as_deref(),
dna.as_deref(),
creator.as_deref(),
fork_parent.as_deref(),
) {
Ok(()) => {
println!("forked {id} -> {branch}");
ExitCode::SUCCESS
}
Err(e) => err(&format!("fork failed: {e}")),
}
}
fn main() -> ExitCode {
let cli = Cli::parse();
let path = db_path(cli.db);
let conn = match ledger::open(&path) {
Ok(c) => c,
Err(e) => return err(&format!("open {}: {e}", path.display())),
};
match cli.cmd {
Cmd::Init => {
println!("initialised {}", path.display());
ExitCode::SUCCESS
}
Cmd::Fork { id, branch, parent, spec_sha, worktree, dna, creator, fork_parent } => {
run_fork(&conn, id, branch, parent, spec_sha, worktree, dna, creator, fork_parent)
}
Cmd::Done { id, summary } => match ledger::done(&conn, &id, &summary) {
Ok(0) => err(&format!("no running agent with id {id}")),
Ok(_) => ExitCode::SUCCESS,
Err(e) => err(&format!("done failed: {e}")),
},
Cmd::Fail { id, reason } => match ledger::fail(&conn, &id, &reason) {
Ok(0) => err(&format!("no running agent with id {id}")),
Ok(_) => ExitCode::SUCCESS,
Err(e) => err(&format!("fail update failed: {e}")),
},
Cmd::Merged { id } => match ledger::merged(&conn, &id) {
Ok(0) => err(&format!("no done/failed agent with id {id}")),
Ok(_) => ExitCode::SUCCESS,
Err(e) => err(&format!("merged failed: {e}")),
},
Cmd::List { status } => cmd_list(&conn, status.as_deref()),
Cmd::Tree { id } => cmd_tree(&conn, &id),
Cmd::Validate { branch, repo_root } => cmd_validate(&branch, &repo_root),
Cmd::Descendants { dna } => cmd_descendants(&conn, &dna),
Cmd::AggregateSkills { since, format } => {
cmd_aggregate_skills(&conn, since, &format)
}
Cmd::RecordSkill { skill_name, success, agent_id, trajectory_id, duration_ms } => {
cmd_record_skill(&conn, &skill_name, success, agent_id, trajectory_id, duration_ms)
}
Cmd::RecordCost { agent_id, cents, provider, model, replace } => {
cmd_record_cost(&conn, &agent_id, cents, &provider, &model, replace)
}
}
}