Hub-and-spoke orchestrator for RULE 0.12 at project scale. SKILL.md
indexes 4 phase files: intake, fork-skeleton, parallel-exec, merge-
ceremony.
Flow:
Phase 1 — 1 free-text line (GOAL) + 1 batch of 5 AskUserQuestion
(type, theory, fanout, main-agent, DB mode).
Phase 2 — project/<slug> branch, kei-ledger fork root, theoretical
sub-agent spawn confirmation click.
Phase 3 — poll kei-ledger list --status running, aggregate
progress.json, steering click (continue / add / kill /
merge / pause).
Phase 4 — kei-ledger validate per bundle, per-branch merge verdict
click (merge --no-ff / squash / reject / defer), final
integration + NO-DOWNGRADE close click if any rejected /
deferred.
>=6 AskUserQuestion calls minimum (1 batch Phase 1 + 1 Phase 2 + 1
Phase 3 + >=2 per-branch Phase 4 + 1 close).
Constructor Pattern: SKILL.md 109 LOC, phase files 80-108 LOC each —
all under 150 LOC.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2.8 KiB
2.8 KiB
Phase 3 — Parallel Execution
Poll the ledger, aggregate progress, let the user steer the fleet mid-run.
3a — Aggregate current state
Every ~60s (or whenever the user asks "status"):
kei-ledger list --status running
kei-ledger list --status done
kei-ledger list --status failed
For each row, attempt to read
.claude/agents/<id>/progress.json (written by the child every ≥ 30s per
feedback_agent_observability). Shape:
{
"pct": 0..100,
"last_step": "short free text",
"wall_time_s": int,
"last_updated": iso8601
}
Aggregate into PROGRESS = {id: {status, pct, last_summary, stale_s}}.
stale_s = now - last_updated. Flag stale_s > 300 as "possibly hung".
Render one-line-per-agent table:
<id> <status> <pct>% <last_step> stale=<s>s
3b — Steering click (AskUserQuestion, ONE)
After displaying the table:
{
"questions": [
{
"question": "Fleet state — how to proceed?",
"header": "Steer",
"multiSelect": false,
"options": [
{"label": "continue polling", "description": "Wait and re-poll — default if > 0 agents are running"},
{"label": "add sub-agent", "description": "Spawn another sub-agent — returns to Phase 2c"},
{"label": "kill stale", "description": "Call kei-ledger fail on agents with stale_s > 300"},
{"label": "proceed to merge", "description": "All required agents done — jump to Phase 4"},
{"label": "pause and review", "description": "Stop polling — user reviews manually, re-enter later"}
]
}
]
}
Route:
continue polling→ re-run 3a after a user-initiated "status" request (do NOT busy-loop on a timer — let the user drive polling cadence)add sub-agent→ jump to Phase 2c, add new row, return to Phase 3kill stale→ for eachstale_s > 300running row:kei-ledger fail <id> --reason "stale > 300s, killed by orchestrator"proceed to merge→ Phase 4 (verify norunningrows remain first)pause and review→ emit final report with current state, user re-enters the skill later (ledger survives)
Verify-criterion
- Ledger-polling code actually called
kei-ledger listat least once per Phase-3 entry (unlessDB_MODE == "file-only"— in which case the orchestrator reads eachprogress.jsondirectly). - Every running sub-agent either has a fresh
progress.json(stale_s < 300) or has been flagged as "possibly hung" in the displayed table. - If
proceed to mergeis clicked whilekei-ledger list --status runningis non-empty — block the transition and re-ask with the list shown. - NO-DOWNGRADE: if every child has failed, DO NOT close the project silently. Emit a 3-path recovery click (retry failed / re-fanout / abort with audit).