User pushback: "транслирует в онлайне какие агенты создаются? основное
окно агента, а дальше при запусках появляются новые ветки, мы показываем
в онлайне как агенты собираются и работают"
Earlier `kei-graph-export` rendered the static SUBSTRATE (all 581 atoms,
catalog-style). User wanted the LIFECYCLE: orchestrator at center, every
new agent as a fading-in branch, every tool call as a pulse, every
completion as a fade-out. TTL = until done; pure online, no history
accumulation per user direction.
Three-layer architecture, all conforming to schema /tmp/agent-events-schema.md:
LAYER 1 — Event emitters (4 hooks)
hooks/agent-event-spawn.sh PreToolUse:Agent → agent_spawn event
hooks/agent-event-done.sh PostToolUse:Agent → agent_done event
(parses STATUS-TRUTH MARKER for outcome,
computes cost_usd from token×pricing table)
hooks/tool-use-event.sh PreToolUse:Bash|Read|Edit|Write|Grep|Glob|NotebookEdit
→ tool_use event
hooks/skill-record.sh EXTENDED — second emit step writes skill_use
event in addition to existing kei-ledger
record-skill call
All 4 are POSIX /bin/sh, defensive (never block, exit 0), bypass via
KEI_EVENTS_BYPASS=1. Append-only JSONL to
~/.claude/memory/agent-events.jsonl.
Smoke: 4 synthetic invocations cover spawn/done/tool/filter cases.
LAYER 2 — kei-graph-stream Rust daemon
_primitives/_rust/kei-graph-stream/ (~480 LOC, 5 files + 1 test)
- Tails events.jsonl every 200ms (poll-based, no notify dep).
- Parses each event, updates AliveState (insert on spawn, remove on done).
- Broadcasts {"type":"event","data":<event>} to all WebSocket clients.
- On client connect: sends {"type":"snapshot","alive":[...]} first.
- Heartbeat: {"type":"ping"} every 30s.
- axum 0.7 + ws feature (already in Cargo.lock via kei-cortex).
- Bypass: KEI_GRAPH_STREAM_BYPASS=1.
Bound to 127.0.0.1:8201 (loopback only). Endpoints:
GET /stream → WebSocket upgrade
GET /health → "kei-graph-stream alive"
4 unit + 1 integration test. cargo build clean.
Installed binary: ~/.cargo/bin/kei-graph-stream
Launchd plist: io.keisei.graph-stream (RunAtLoad, KeepAlive)
Loaded as PID 52678, /health 200 OK verified.
LAYER 3 — live-graph.html (single-file frontend)
~/Projects/lbm-graph-viz/live-graph.html (~464 LOC, self-contained)
- SVG full-viewport, dark #0f172a, CSS grid background.
- Pinned center node "main" (orchestrator), gold #fbbf24, glowing.
- Agents radiate via D3 force-simulation; color-by-model
(sonnet=green, opus=red, haiku=blue, default=gray).
- On agent_spawn: fade-in 300ms, edge from main to new node.
- On tool_use: pulse on agent node (r 8→12→8 over 400ms) +
floating tool name label fades 800ms.
- On agent_done: outcome-color flash → fade-out 800ms → remove.
- WebSocket client: ws://127.0.0.1:8201/stream, exponential-backoff
reconnect (1s→30s).
- Top-right status badge: ● connected | ○ reconnecting | ✕ disconnected.
- Bottom counters: alive / spawned / tool calls / done / last event age.
- No build step. D3 v7 from CDN. Pure HTML+JS+CSS.
End-to-end smoke (this machine, just now):
- daemon health 200 OK
- hook injected agent_spawn → daemon broadcasts → AliveState=1
- hook injected agent_done → daemon broadcasts → AliveState=0
- frontend file syntax-checked clean
What this does NOT do (deferred, by user direction "это онлайн"):
- History persistence — agents who finished are GONE from the graph.
Per-session log remains in events.jsonl + sleep-sync if user wants
to consult later, but the live view is RIGHT NOW only.
- Sub-agent attribution beyond "main" — orchestrator-direct tool calls
show on the orchestrator node. Sub-agent's internal tool calls would
need session-id correlation; current schema has agent_id="main"
placeholder for non-Agent tool calls.
- Replay mode — no time-scrubber. Possible follow-up if useful.
- Auth on WebSocket — bound to 127.0.0.1 only. Local-only by design.
=== STATUS-TRUTH MARKER ===
shipped: functional
stubs: 0
cargo-check: PASS
behaviour-verified: yes
follow-up-required:
- Sub-agent tool-call attribution (correlate session_id chain)
- Replay mode with time scrubber (if user finds use)
- Tool aggregator nodes ("Bash bucket" with N) instead of per-agent pulses
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
68 lines
2.8 KiB
Bash
Executable file
68 lines
2.8 KiB
Bash
Executable file
#!/bin/sh
|
|
# skill-record.sh — PostToolUse:Skill hook.
|
|
# Records every skill invocation to kei-ledger for Phase D nightly metrics.
|
|
# Also emits skill_use event to agent-events.jsonl (schema 2026-05-02).
|
|
# Defensive: never blocks, exits 0 on every path.
|
|
set -u
|
|
|
|
[ "${SKILL_RECORD_BYPASS:-0}" = "1" ] && exit 0
|
|
command -v jq >/dev/null 2>&1 || exit 0
|
|
|
|
PAYLOAD=$(cat 2>/dev/null || true)
|
|
[ -n "$PAYLOAD" ] || exit 0
|
|
|
|
# Only fire for Skill tool calls — Claude Code may chain hooks for any tool.
|
|
TOOL=$(printf '%s' "$PAYLOAD" | jq -r '.tool_name // empty' 2>/dev/null)
|
|
[ "$TOOL" = "Skill" ] || exit 0
|
|
|
|
SKILL=$(printf '%s' "$PAYLOAD" | jq -r '.tool_input.skill // .tool_input.skillName // empty' 2>/dev/null)
|
|
[ -n "$SKILL" ] || exit 0
|
|
|
|
# Success heuristic: prefer explicit exit_code, then status string, then
|
|
# non-empty content array, then string response non-empty. Default 0.
|
|
SUCCESS=$(printf '%s' "$PAYLOAD" | jq -r '
|
|
if (.tool_response // empty | type) == "object" then
|
|
if (.tool_response.exit_code // 1) == 0 then 1
|
|
elif (.tool_response.status // "") | test("ok|completed|done"; "i") then 1
|
|
elif (.tool_response.content // [] | length) > 0 then 1
|
|
else 0 end
|
|
elif (.tool_response // empty | type) == "string" then
|
|
if .tool_response == "" then 0 else 1 end
|
|
elif (.tool_response // empty | type) == "array" then
|
|
if (.tool_response | length) > 0 then 1 else 0 end
|
|
else 0 end
|
|
' 2>/dev/null)
|
|
[ -n "$SUCCESS" ] || SUCCESS=0
|
|
|
|
DURATION=$(printf '%s' "$PAYLOAD" | jq -r '
|
|
.duration_ms // .tool_response.totalDurationMs // empty
|
|
' 2>/dev/null)
|
|
|
|
AGENT_ID=$(printf '%s' "$PAYLOAD" | jq -r '.tool_use_id // empty' 2>/dev/null)
|
|
|
|
# kei-ledger record (optional — skip gracefully if not installed).
|
|
if command -v kei-ledger >/dev/null 2>&1; then
|
|
ARGS="$SKILL --success $SUCCESS"
|
|
[ -n "$AGENT_ID" ] && ARGS="$ARGS --agent-id $AGENT_ID"
|
|
[ -n "$DURATION" ] && ARGS="$ARGS --duration-ms $DURATION"
|
|
# shellcheck disable=SC2086
|
|
kei-ledger record-skill $ARGS >/dev/null 2>&1 || true
|
|
fi
|
|
|
|
# Emit skill_use event to agent-events.jsonl (schema 2026-05-02).
|
|
if [ "${KEI_EVENTS_BYPASS:-0}" != "1" ]; then
|
|
EVENTS_FILE="$HOME/.claude/memory/agent-events.jsonl"
|
|
mkdir -p "$(dirname "$EVENTS_FILE")" 2>/dev/null || true
|
|
SESSION_ID_SKILL=$(printf '%s' "$PAYLOAD" | jq -r '.session_id // "main"' 2>/dev/null)
|
|
[ -z "$SESSION_ID_SKILL" ] && SESSION_ID_SKILL="main"
|
|
jq -cn \
|
|
--arg ts "$(date -u +%Y-%m-%dT%H:%M:%S.000Z 2>/dev/null)" \
|
|
--arg id "${AGENT_ID:-unknown}" \
|
|
--arg agent_id "$SESSION_ID_SKILL" \
|
|
--arg skill "$SKILL" \
|
|
--argjson success "${SUCCESS:-0}" \
|
|
'{ts:$ts,event:"skill_use",id:$id,agent_id:$agent_id,skill:$skill,success:($success==1)}' \
|
|
>> "$EVENTS_FILE" 2>/dev/null || true
|
|
fi
|
|
|
|
exit 0
|