fix(live-graph): attribute tool_use events to the spawning agent

User pushback: live-graph showed only "main" node, no pulses on agents.
Root cause: hook stdin doesn't carry parent_tool_use_id for sub-agent
tool calls — we only get the sub-agent's own session_id, which doesn't
link back to the spawn's tool_use_id.

Sequential heuristic via shared state file:
  - agent-event-spawn.sh appends tool_use_id to /tmp/kei-active-children.tsv
  - tool-use-event.sh reads the LAST line of that file → uses that
    tool_use_id as agent_id for the emitted event
  - agent-event-done.sh removes the spawn's line (grep -v + atomic mv)
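
The three-hook flow above can be sketched as one self-contained script. This is a minimal illustration, not the committed hook code: the function names `ledger_add`, `ledger_current`, and `ledger_remove`, the `toolu_demo` ID, and the temp file standing in for the real ledger are all assumptions, and the removal step compares fields with `awk` instead of the `grep -v` the hook uses.

```sh
#!/bin/sh
# Hypothetical sketch of the active-spawn ledger life cycle.
# A temp file stands in for /tmp/kei-active-children.tsv.
LEDGER=$(mktemp)

# Spawn hook: append "<epoch>\t<tool_use_id>".
ledger_add() {
    printf '%s\t%s\n' "$(date +%s)" "$1" >> "$LEDGER"
}

# Tool-use hook: newest live spawn, or "main" when the ledger is empty.
ledger_current() {
    id=""
    if [ -s "$LEDGER" ]; then
        id=$(tail -n 1 "$LEDGER" | awk '{print $2}')
    fi
    printf '%s\n' "${id:-main}"
}

# Done hook: keep every line whose second field differs, then swap via mv.
ledger_remove() {
    awk -F '\t' -v id="$1" '$2 != id' "$LEDGER" > "$LEDGER.tmp"
    mv "$LEDGER.tmp" "$LEDGER"
}

ledger_add toolu_demo
during=$(ledger_current)     # while the sub-agent is alive
ledger_remove toolu_demo
after=$(ledger_current)      # after agent_done
echo "during=$during after=$after"
```

Run as-is, this prints `during=toolu_demo after=main`, matching the fallback behaviour described in the verification paragraph.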

Verified end-to-end: a code-implementer agent ran 5 Bash calls during
its lifetime — all 5 tool_use events were correctly attributed to the
spawn's tool_use_id. After agent_done, subsequent orchestrator-direct
tool calls correctly fall back to agent_id="main".

Limitation: parallel agents may misattribute. The "most recent live
spawn" heuristic is correct only when one agent runs at a time, which
is the common case. Parallel spawns share
/tmp/kei-active-children.tsv, so every sub-agent's tool calls are
attributed to whichever spawn appended last. Acceptable for v1 demo;
proper parent-tool-use-id propagation requires Claude Code to expose
it in sub-agent stdin (upstream change).
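
The misattribution can be reproduced by hand. A small sketch, assuming two overlapping spawns with made-up IDs `toolu_A` and `toolu_B` and a temp file in place of the real ledger:

```sh
#!/bin/sh
# Two spawns are alive at once; A was appended first, B second.
LEDGER=$(mktemp)
printf '%s\t%s\n' 1000 toolu_A >> "$LEDGER"
printf '%s\t%s\n' 1001 toolu_B >> "$LEDGER"

# Any tool call arriving now is attributed to the last ledger line,
# regardless of which sub-agent actually issued it.
attributed=$(tail -n 1 "$LEDGER" | awk '{print $2}')
echo "attributed to: $attributed"

rm -f "$LEDGER"
```

This prints `attributed to: toolu_B`, even if the call came from the agent spawned as `toolu_A`.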

The `mv` after `grep -v` runs UNCONDITIONALLY (not gated on grep's
exit code): grep -v exits 1 when ALL lines match and nothing is
output, which is exactly the case when the last live spawn is removed.
Gating the `mv` on that status would leave the stale file in place.
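
The exit-status behaviour is easy to check in isolation. A minimal sketch, assuming a one-line ledger with the made-up ID `toolu_only`:

```sh
#!/bin/sh
# Ledger holding exactly one live spawn.
f=$(mktemp)
printf '1700000000\ttoolu_only\n' > "$f"

# Removing the only line leaves grep -v with nothing to print,
# so it exits 1 even though the filtering did what we wanted.
grep -v 'toolu_only$' "$f" > "$f.tmp"
status=$?
echo "grep exit status: $status"

# `grep ... && mv ...` would skip this step and keep the stale line.
mv "$f.tmp" "$f"
if [ -s "$f" ]; then result="stale"; else result="empty"; fi
echo "ledger is $result"

rm -f "$f"
```

This prints `grep exit status: 1` followed by `ledger is empty`.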

Bypass: `KEI_EVENTS_BYPASS=1` (existing) covers all 3 hooks.
Override path: `KEI_ACTIVE_SPAWNS_FILE=/path/to/file`.
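
A short sketch of how the override resolves, based on the `${KEI_ACTIVE_SPAWNS_FILE:-...}` expansion the hooks use. The helper name `resolve_ledger` and the path `/tmp/demo-ledger.tsv` are illustrative assumptions, not part of the commit:

```sh
#!/bin/sh
# Hypothetical helper mirroring the hooks' default-path expansion.
resolve_ledger() {
    printf '%s\n' "${KEI_ACTIVE_SPAWNS_FILE:-/tmp/kei-active-children.tsv}"
}

# Unset (or empty) variable: the default ledger path is used.
unset KEI_ACTIVE_SPAWNS_FILE
default_path=$(resolve_ledger)

# Set variable: the override wins. Example path only.
KEI_ACTIVE_SPAWNS_FILE=/tmp/demo-ledger.tsv
override_path=$(resolve_ledger)

echo "default:  $default_path"
echo "override: $override_path"
```

Pointing the variable at a scratch file is a convenient way to exercise the hooks without touching the shared ledger in /tmp.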

=== STATUS-TRUTH MARKER ===
shipped: functional
stubs: 0
cargo-check: NOT-RUN
behaviour-verified: yes
follow-up-required:
  - Parallel-agent attribution would need parent_tool_use_id from
    Claude Code sub-agent stdin (not currently exposed).
  - Race condition window between spawn append and done remove is
    millisecond-scale; observed clean in single-agent demo.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Parfii-bot 2026-05-02 14:43:42 +08:00
parent 52a02dfbff
commit 4e7463ef0a
3 changed files with 41 additions and 11 deletions

@@ -56,4 +56,13 @@ jq -cn \
duration_ms:$duration_ms,tool_use_count:$tool_use_count,cost_usd:$cost_usd}' \
>> "$EVENTS_FILE" 2>/dev/null || true
+# Remove this spawn from the active-children ledger (mirror of spawn hook).
+# `grep -v` exits 1 when every line matches (nothing is left to print), so
+# the `mv` runs UNCONDITIONALLY (not gated on grep's exit status).
+ACTIVE_FILE="${KEI_ACTIVE_SPAWNS_FILE:-/tmp/kei-active-children.tsv}"
+if [ -n "$TOOL_USE_ID" ] && [ -f "$ACTIVE_FILE" ]; then
+  grep -v "$(printf '\t')$TOOL_USE_ID\$" "$ACTIVE_FILE" > "$ACTIVE_FILE.tmp" 2>/dev/null
+  mv "$ACTIVE_FILE.tmp" "$ACTIVE_FILE" 2>/dev/null || true
+fi
exit 0

@@ -2,7 +2,9 @@
# agent-event-spawn.sh — PreToolUse:Agent hook.
#
# Emits `agent_spawn` event to ~/.claude/memory/agent-events.jsonl
-# per the locked schema at /tmp/agent-events-schema.md (2026-05-02).
+# AND records the tool_use_id in /tmp/kei-active-children.tsv so
+# tool-use-event.sh can attribute incoming sub-agent tool calls
+# to this spawn (sub-agent stdin lacks parent_tool_use_id).
#
# Defensive: never blocks, exits 0 on every path.
# Bypass via `KEI_EVENTS_BYPASS=1`.
@@ -14,7 +16,6 @@ command -v jq >/dev/null 2>&1 || exit 0
PAYLOAD=$(cat 2>/dev/null || true)
[ -n "$PAYLOAD" ] || exit 0
# Self-filter: this hook may be chained for ANY PreToolUse event.
TOOL=$(printf '%s' "$PAYLOAD" | jq -r '.tool_name // empty' 2>/dev/null)
[ "$TOOL" = "Agent" ] || exit 0
@@ -23,8 +24,6 @@ mkdir -p "$(dirname "$EVENTS_FILE")" 2>/dev/null || true
TS=$(date -u +%Y-%m-%dT%H:%M:%S.000Z 2>/dev/null)
# Build event in a single jq pass from the raw payload.
# All nullable fields use jq // null so schema types are correct.
printf '%s' "$PAYLOAD" | jq -c \
--arg ts "$TS" \
'{
@@ -43,4 +42,15 @@ printf '%s' "$PAYLOAD" | jq -c \
}' \
>> "$EVENTS_FILE" 2>/dev/null || true
+# Active-spawn ledger for tool-use attribution. Sub-agent's hook stdin
+# carries no parent_tool_use_id, so we maintain a small TSV of currently
+# alive spawns; tool-use-event.sh attributes incoming tool_use events to
+# the MOST RECENT live spawn (sequential heuristic — works for the common
+# single-agent-at-a-time case; parallel agents may misattribute).
+TOOL_USE_ID=$(printf '%s' "$PAYLOAD" | jq -r '.tool_use_id // .toolUseId // empty' 2>/dev/null)
+ACTIVE_FILE="${KEI_ACTIVE_SPAWNS_FILE:-/tmp/kei-active-children.tsv}"
+if [ -n "$TOOL_USE_ID" ]; then
+  printf '%s\t%s\n' "$(date +%s)" "$TOOL_USE_ID" >> "$ACTIVE_FILE" 2>/dev/null || true
+fi
exit 0

@@ -1,10 +1,12 @@
#!/bin/sh
# tool-use-event.sh — PreToolUse hook for Bash/Read/Edit/Write/Grep/Glob/NotebookEdit.
#
-# Emits `tool_use` event to ~/.claude/memory/agent-events.jsonl
-# per the locked schema at /tmp/agent-events-schema.md (2026-05-02).
+# Emits `tool_use` event to ~/.claude/memory/agent-events.jsonl.
+# Attributes the call to the parent agent via /tmp/kei-active-children.tsv
+# (most-recent-spawn heuristic) so the live-graph viewer can pulse the
+# correct node.
#
-# Agent tools (spawns) are intentionally excluded — handled by agent-event-spawn.sh.
+# Agent tools (spawns) are excluded — handled by agent-event-spawn.sh.
# Defensive: never blocks, exits 0 on every path.
# Bypass via `KEI_EVENTS_BYPASS=1`.
set -u
@@ -15,7 +17,6 @@ command -v jq >/dev/null 2>&1 || exit 0
PAYLOAD=$(cat 2>/dev/null || true)
[ -n "$PAYLOAD" ] || exit 0
# Self-filter: only emit for the tracked tool set.
TOOL=$(printf '%s' "$PAYLOAD" | jq -r '.tool_name // empty' 2>/dev/null)
case "$TOOL" in
Bash|Read|Edit|Write|Grep|Glob|NotebookEdit) ;;
@@ -27,9 +28,19 @@ mkdir -p "$(dirname "$EVENTS_FILE")" 2>/dev/null || true
TOOL_USE_ID=$(printf '%s' "$PAYLOAD" | jq -r '.tool_use_id // .toolUseId // "unknown"' 2>/dev/null)
-# Parent agent id: use session_id if present, otherwise "main".
-AGENT_ID=$(printf '%s' "$PAYLOAD" | jq -r '.session_id // "main"' 2>/dev/null)
-[ -z "$AGENT_ID" ] && AGENT_ID="main"
+# Parent agent attribution. Claude Code stdin carries session_id of WHOEVER
+# is running (orchestrator OR sub-agent), but does NOT give parent spawn's
+# tool_use_id. We consult the active-spawns ledger written by
+# agent-event-spawn.sh / removed by agent-event-done.sh:
+#   - non-empty file → attribute to the MOST RECENT live spawn
+#     (sequential heuristic — works for single-agent-at-a-time)
+#   - empty file → fall back to "main" (orchestrator)
+ACTIVE_FILE="${KEI_ACTIVE_SPAWNS_FILE:-/tmp/kei-active-children.tsv}"
+AGENT_ID="main"
+if [ -s "$ACTIVE_FILE" ]; then
+  LAST_SPAWN=$(tail -1 "$ACTIVE_FILE" 2>/dev/null | awk '{print $2}')
+  [ -n "$LAST_SPAWN" ] && AGENT_ID="$LAST_SPAWN"
+fi
jq -cn \
--arg ts "$(date -u +%Y-%m-%dT%H:%M:%S.000Z 2>/dev/null)" \