docs(agent-substrate): v1 schema DRAFT — capability triplet + role + task spec + runtime contract

Sibling SSoT to SUBSTRATE-SCHEMA.md (atom substrate). This one decomposes agent invocations rather than code primitives.

Core contribution: the capability TRIPLET, not just text:
- text.md: what the agent reads (prompt fragment)
- gate.sh: PreToolUse hook (runtime enforcement)
- verify.sh: on-return predicate run from main repo (not worktree)

Motivation from substrate v1 orchestration audit:
- 40% prompt boilerplate across 7 spawns (git-ban + constructor-pattern + report format etc. copy-pasted each time)
- Self-reported green tests broke at integration (E1 jsonschema regression: agent claimed PASS from worktree but main workspace failed; caught only by integration test)
- Scope violations (E1 touched invoke.rs when E3 was supposed to own it; surfaced only at merge)

Triplet closes all three gaps: capabilities aren't promises agents make in prose, they're enforced by gate hooks pre-exec and verified by predicates on return from main branch clean state.

Schema specifies:
- Capability atom layout: _capabilities/<category>/<slug>/
- capability.toml frontmatter shape
- text.md / gate.sh / verify.sh contracts
- Role = bundle of capabilities (5 roles: read-only, explorer, edit-local, edit-shared, git-ops)
- task.toml shape (orchestrator-written per spawn; parameterizes roles)
- kei-agent-runtime crate contract: compose + spawn + verify + run
- Initial 10-capability inventory for phase 1
- 8-entry decision log with proposed defaults
- 5-phase build plan (phases 1-3 start in parallel after lock, phase 4 follows 1+3; ~5-7 days wall time)

Open questions flagged at bottom for review before AGENT-SCHEMA-LOCKED.md. Once locked: sibling SSoTs (atoms + agents) evolve symmetrically: agents compose atoms, atoms compose agents (ultimate goal).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 01:39:23 +08:00 · 372bbc8320
commit 372bbc8320
parent 59fd1a0362
1 changed files with 424 additions and 0 deletions
--- a/docs/AGENT-SUBSTRATE-SCHEMA.md
+++ b/docs/AGENT-SUBSTRATE-SCHEMA.md
@@ -0,0 +1,424 @@
+# KeiSeiKit Agent Substrate Schema v1
+
+**STATUS:** Draft — under review. Once approved and `AGENT-SCHEMA-LOCKED.md` is committed, this document is **LOCKED** for 3 weeks of parallel phase work.
+
+**PURPOSE:** Sibling SSoT to `SUBSTRATE-SCHEMA.md`. That one decomposes code primitives (atoms). This one decomposes **agent invocations** (capabilities).
+
+**Motivation from substrate v1 orchestration pain:** across 7 agent spawns in audit+follow-up waves, the same friction recurred — 40% prompt boilerplate, self-reported green tests that broke at integration, scope violations surfacing only after merge. Fix: capabilities become **enforced triplets**, not suggestions in freetext prompts.
+
+---
+
+## Core concept: capability atom = triplet
+
+An **agent capability** is not a reusable text block. It is a **three-artifact bundle** that states each restriction in the prompt, enforces it at tool-call time, and verifies it on return:
+
+| Artifact | What | Who consumes |
+|---|---|---|
+| `text.md` | The fragment the agent reads in its composed prompt | Agent (via LLM context) |
+| `gate.sh` | PreToolUse hook fragment, runs before every tool call | Claude Code harness |
+| `verify.sh` | Return-time predicate, runs from main repo on composed result | Orchestrator (kei-agent-runtime) |
+
+**The invariant:** if any of the three fails, the capability did not hold. Self-reported compliance is not trusted — verification runs from main branch after agent return, against a clean workspace, not from the agent's worktree where its own changes are hidden state.
+
+---
+
+## File layout
+
+```
+_capabilities/
+├── policy/
+│   ├── no-git-ops/
+│   │   ├── capability.toml          — metadata (name, category, tools, patterns)
+│   │   ├── text.md                  — prompt fragment
+│   │   ├── gate.sh                  — PreToolUse hook
+│   │   └── verify.sh                — on-return check
+│   └── …
+├── scope/
+│   ├── files-whitelist/             — parameterized by task.scope.files-whitelist
+│   └── files-denylist/
+├── quality/
+│   ├── constructor-pattern/
+│   ├── cargo-check-green/
+│   └── tests-green/
+├── safety/
+│   └── no-dep-bump/
+├── output/
+│   ├── report-format/
+│   └── severity-grade/
+└── tools/
+    ├── read-only/
+    └── cargo-only-bash/
+
+_roles/
+├── read-only.toml                   — bundle of capabilities + tool allowlist
+├── explorer.toml
+├── edit-local.toml                  — the code-implementer role
+├── edit-shared.toml
+└── git-ops.toml                     — documented but NOT spawnable (orchestrator only)
+
+_primitives/_rust/kei-agent-runtime/  — new crate, phase 3
+├── Cargo.toml
+├── src/
+│   ├── lib.rs
+│   ├── compose.rs                   — task.toml + role + capabilities → prompt.md
+│   ├── spawn.rs                     — Agent tool call with composed prompt
+│   ├── gate.rs                      — install hooks parameterized by task scope
+│   └── verify.rs                    — run all capability verify.sh predicates
+└── tests/
+
+tasks/                                — ephemeral, gitignored
+└── (generated per spawn: <agent-id>/task.toml + prompt.md)
+
+docs/AGENT-SUBSTRATE-SCHEMA.md       — this file
+docs/AGENT-ROLES.md                  — human-readable role matrix (generated from _roles/*.toml)
+docs/AGENT-SCHEMA-LOCKED.md          — lock marker
+```
+
+---
+
+## Capability atom — `capability.toml` shape
+
+```toml
+[capability]
+name = "policy::no-git-ops"           # <category>::<slug> namespace
+category = "policy"                   # policy | scope | quality | safety | output | tools
+version = "1.0"
+description = "RULE 0.13 — orchestrator owns git, agent writes files only"
+rationale = "See ~/.claude/rules/orchestrator-branch-first.md"
+
+[restricts]
+# What this capability forbids. Runtime gate enforces.
+tool-patterns = [                     # matched against tool_input.command
+  '^git( |$)',
+  '^gh (repo|api /repos)',
+]
+tools-denied = []                     # e.g. ["Edit", "Write"] for read-only
+
+[parameterized]
+# Is this capability instance-configurable per task?
+accepts = []                          # e.g. ["files-whitelist"] for scope/* caps
+
+[text]
+path = "text.md"                      # relative to capability dir
+
+[gate]
+path = "gate.sh"
+event = "PreToolUse:Bash"             # one of: "PreToolUse:Bash", "PreToolUse:Edit|Write", "PreToolUse:Agent"
+severity = "block"                    # block (exit 2) | warn (exit 0 + stderr) | advisory (log only)
+bypass-env = "ORCHESTRATOR_META"      # optional env var to disable
+
+[verify]
+path = "verify.sh"
+run-from = "main-worktree"            # main-worktree | agent-worktree | both
+when = "on-return"                    # on-return | per-tool-call
+```
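A minimal sketch of how the three `[gate].severity` values could map onto hook behaviour, following the comment on the `severity` field above. `gate_decide` and the `GATE_LOG` variable are hypothetical names introduced for illustration; they are not part of the schema.

```bash
#!/usr/bin/env bash
# Hypothetical helper: translate a capability's severity into gate behaviour.
# GATE_LOG is an assumed log location, not something the schema defines.
gate_decide() {
  local severity=$1 reason=$2
  case "$severity" in
    block)
      printf '%s\n' "$reason" >&2
      return 2 ;;   # exit 2: the tool call is blocked
    warn)
      printf 'warning: %s\n' "$reason" >&2
      return 0 ;;   # exit 0 + stderr: the call proceeds with a warning
    advisory)
      printf '%s\n' "$reason" >>"${GATE_LOG:-/tmp/agent-gate.log}"
      return 0 ;;   # log only
    *)
      printf 'unknown severity: %s\n' "$severity" >&2
      return 1 ;;   # assumption: treat a bad config as an infrastructure error
  esac
}
```

A gate script could end with `gate_decide "$SEVERITY" "$REASON"; exit $?`, so that changing `severity` in `capability.toml` changes enforcement without touching the matching logic.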
+
+---
+
+## Capability `text.md` conventions
+
+- Imperative, second-person, short.
+- ≤ 200 words per fragment.
+- No overlap — if two capabilities say the same thing, extract into a shared one.
+- Fragment stands alone — composer concatenates multiple fragments with `\n\n---\n\n` separator; fragments must not reference each other.
+- Lead with the rule ("You MUST NOT X"), follow with the why ("because Y").
+
+Example (`_capabilities/policy/no-git-ops/text.md`):
+
+```markdown
+## No git operations
+
+You MUST NOT invoke `git`, `gh repo`, `gh api /repos`, or any shell
+command that modifies git state. Orchestrator handles all git operations
+(commits, branches, pushes, rebases).
+
+If your task requires staging a change, describe it in the return
+file-list — the orchestrator will commit on your behalf.
+
+Bypass exists for orchestrator-meta agents only; it is not available here.
+```
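The conventions above (a 200-word cap and concatenation with a `\n\n---\n\n` separator) can be sketched as two small shell functions. This is only an illustration under stated assumptions: `fragment_within_budget` and `compose_fragments` are hypothetical names, and the real composer is planned as `compose.rs` in `kei-agent-runtime`.

```bash
#!/usr/bin/env bash
# Hypothetical helpers illustrating the text.md conventions.

# Succeeds if the fragment has at most MAX words (default 200).
fragment_within_budget() {
  local file=$1 max=${2:-200}
  local words
  words=$(( $(wc -w <"$file") ))   # arithmetic expansion trims wc's padding
  [ "$words" -le "$max" ]
}

# Concatenates fragments in argument order, separated by "\n\n---\n\n".
compose_fragments() {
  local first=1 f
  for f in "$@"; do
    if [ "$first" -eq 0 ]; then
      printf '\n\n---\n\n'
    fi
    first=0
    # Command substitution drops trailing newlines, so the separator
    # is exact regardless of how each file ends.
    printf '%s' "$(cat -- "$f")"
  done
  printf '\n'
}
```

Called as `compose_fragments _capabilities/policy/no-git-ops/text.md _capabilities/output/report-format/text.md`, this would print the two fragments in role order with a horizontal rule between them.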
+
+---
+
+## Capability `gate.sh` contract
+
+Shell script invoked by Claude Code as a PreToolUse hook. Receives JSON on stdin (`tool_name`, `tool_input`, plus task context injected by orchestrator).
+
+Exit codes:
+- `0` — allow tool call to proceed
+- `2` — block tool call (permissionDecision: deny)
+- any other — treated as infrastructure error, warn to stderr, allow (fail-open per existing hook convention)
+
+Canonical shape (see `_capabilities/policy/no-git-ops/gate.sh` as reference impl):
+
+```bash
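+#!/usr/bin/env bash
+set -eu
+CMD=$(jq -r '.tool_input.command // empty' 2>/dev/null || true)
+[ -z "$CMD" ] && exit 0
+
+# Bypass for orchestrator-meta agents
+[ "${ORCHESTRATOR_META:-0}" = "1" ] && exit 0
+
+# Load restrictions from capability.toml at runtime.
+# Assumes the helper emits one pattern per line.
+PATTERNS=$(capability_patterns policy::no-git-ops tool-patterns)
+
+# Read line by line: patterns such as '^git( |$)' contain spaces, so an
+# unquoted `for pat in $PATTERNS` would split them into broken fragments.
+while IFS= read -r pat; do
+  [ -n "$pat" ] || continue
+  if printf '%s\n' "$CMD" | grep -qE -- "$pat"; then
+    REASON="🚫 capability policy::no-git-ops: command matches $pat"
+    # Build the JSON with jq so regex characters in $pat cannot break it
+    jq -cn --arg reason "$REASON" \
+      '{hookSpecificOutput:{hookEventName:"PreToolUse",permissionDecision:"deny",permissionDecisionReason:$reason}}'
+    # Repeat the reason on stderr in case the harness reads it there on exit 2
+    printf '%s\n' "$REASON" >&2
+    exit 2
+  fi
+done <<<"$PATTERNS"
+exit 0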
+```
+
+`capability_patterns` is a small helper shipped with the runtime (reads the capability.toml and emits the list).
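For illustration, a minimal sketch of what such a helper might look like. The document does not define its implementation, so everything here is an assumption: the `CAPABILITIES_DIR` variable, the one-pattern-per-line output, and the restriction to multi-line arrays holding one single-quoted TOML literal string per line, as in the `capability.toml` example.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of capability_patterns <category>::<slug> <key>.
# Not a TOML parser: it only handles arrays laid out as
#   key = [
#     'pattern',
#   ]
capability_patterns() {
  local id=$1 key=$2
  local category=${id%%::*} slug=${id#*::}
  local file="${CAPABILITIES_DIR:-_capabilities}/$category/$slug/capability.toml"
  awk -v key="$key" -v q="'" '
    in_array && /^[[:space:]]*\]/ { exit }    # closing bracket ends the array
    in_array {
      # Print what sits between the first and last single quote on the line
      if (match($0, q ".*" q)) print substr($0, RSTART + 1, RLENGTH - 2)
      next
    }
    index($0, key " = [") == 1 { in_array = 1 }   # array header line
  ' "$file"
}
```

Emitting one pattern per line lets a gate consume the output with `while IFS= read -r pat`, which keeps patterns containing spaces, such as `^git( |$)`, intact.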
+
+---
+
+## Capability `verify.sh` contract
+
+Shell script invoked by orchestrator after agent return. Receives env:
+- `$AGENT_ID` — the agent's internal ID
+- `$TASK_TOML` — path to the task spec the agent was given
+- `$WORKTREE_PATH` — path to the agent's worktree
+- `$MAIN_REPO` — path to the orchestrator's main repo
+- plus any capability-specific env from `[parameterized]`
+
+Exit codes:
+- `0` — predicate holds, capability verified
+- non-zero — violation. stderr MUST contain a one-line human-readable description (≤ 200 chars).
+
+Example (`_capabilities/quality/cargo-check-green/verify.sh`):
+
+```bash
+#!/usr/bin/env bash
+set -eu
+LOG="/tmp/agent-$AGENT_ID-check.log"
+cd "$MAIN_REPO/_primitives/_rust"
+if ! cargo check --workspace >"$LOG" 2>&1; then
+  # First stderr line is the one-line description the contract requires
+  echo "cargo check --workspace FAILED in main workspace; the agent's self-reported PASS does not hold" >&2
+  tail -5 "$LOG" >&2
+  exit 1
+fi
+exit 0
+
+This runs from `$MAIN_REPO` — not from the agent's worktree — because agent-local passing ≠ main-integration passing (the lesson from E1's jsonschema regression in audit wave).
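The same contract could apply to the scope capabilities. Below is a hedged sketch of an on-return scope check, assuming the orchestrator supplies the changed paths on stdin and the globs from `task.toml` in two bash arrays. `WHITELIST`, `DENYLIST`, and the function names are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a scope predicate; names are illustrative only.

# Succeeds if PATH matches any of the given glob patterns.
matches_any() {
  local path=$1 pat
  shift
  for pat in "$@"; do
    # In a case pattern, * also matches "/", so "dir/**" covers the subtree.
    case "$path" in
      $pat) return 0 ;;
    esac
  done
  return 1
}

# A path is in scope if it is whitelisted and not denylisted.
path_in_scope() {
  local path=$1
  if matches_any "$path" "${DENYLIST[@]}"; then
    return 1   # denylist overrides whitelist
  fi
  matches_any "$path" "${WHITELIST[@]}"
}

# Reads changed paths on stdin; reports every violation on stderr.
check_scope() {
  local path violations=0
  while IFS= read -r path; do
    [ -n "$path" ] || continue
    if ! path_in_scope "$path"; then
      echo "out-of-scope change: $path" >&2
      violations=$((violations + 1))
    fi
  done
  [ "$violations" -eq 0 ]
}
```

A `verify.sh` built on this might pipe the output of `git diff --name-only` for the agent's changes into `check_scope`; which base ref to diff against is left open here, since the document does not specify it.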
+
+---
+
+## Role — `_roles/<name>.toml` shape
+
+```toml
+[role]
+name = "edit-local"
+display-name = "code-implementer (local edit scope)"
+description = "Write code + run cargo check/test + emit report. No git, no workspace touches."
+
+[capabilities]
+# Ordered list — text.md fragments concatenated in this order
+required = [
+  "policy::no-git-ops",
+  "scope::files-whitelist",
+  "scope::files-denylist",
+  "quality::constructor-pattern",
+  "quality::cargo-check-green",
+  "quality::tests-green",
+  "safety::no-dep-bump",
+  "output::report-format",
+]
+
+[tools]
+# Tool allowlist — anything not in this list is denied
+allowed = ["Read", "Write", "Edit", "Glob", "Grep", "Bash"]
+# Bash further restricted by quality/tools atoms
+bash-patterns-allowed = ['^cargo( |$)', '^mkdir( |$)', '^rm -rf /tmp/']
+
+[escalation]
+policy = "ask-via-return"              # ask-via-return | orchestrator-notify | fail-fast
+```
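To make the effect of `bash-patterns-allowed` concrete, here is a small sketch of an allowlist check, assuming the patterns are extended regular expressions matched against the whole command string. `bash_allowed` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical allowlist check: a command passes only if it matches
# at least one allowed pattern (deny by default).
bash_allowed() {
  local cmd=$1 pat
  shift
  for pat in "$@"; do
    if printf '%s\n' "$cmd" | grep -qE -- "$pat"; then
      return 0
    fi
  done
  return 1
}

# Patterns copied from the role example above.
ALLOWED=('^cargo( |$)' '^mkdir( |$)' '^rm -rf /tmp/')
```

With these patterns `bash_allowed 'cargo check -p kei-forge' "${ALLOWED[@]}"` succeeds and `bash_allowed 'git status' "${ALLOWED[@]}"` fails. One caveat worth weighing before lock: the patterns anchor only the start of the command, so a chained command such as `cargo check && git push` also matches `^cargo( |$)`. A real gate may need to reject shell operators or rely on `policy::no-git-ops` as a second layer.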
+
+---
+
+## Task spec — `task.toml` shape (orchestrator writes per spawn)
+
+```toml
+[task]
+role = "edit-local"
+agent-id = "abc123…"                  # allocated by kei-ledger fork
+# parent-agent = "<parent-id>"        # set only for nested spawns; TOML has no null, so omit the key otherwise
+
+[scope]
+# Parameterizes scope::files-whitelist + scope::files-denylist
+files-whitelist = [
+  "_primitives/_rust/kei-forge/**",
+]
+files-denylist = [
+  "_primitives/_rust/Cargo.toml",
+  "_primitives/_rust/Cargo.lock",
+  "scripts/**",
+  ".github/**",
+]
+
+[verification]
+# Parameterizes quality/* caps
+cargo-check-crates = ["kei-forge"]
+cargo-test-crates = ["kei-forge"]
+test-count-min = 44
+
+[output]
+# Parameterizes output/report-format
+report-fields-required = ["files-touched", "cargo-check", "cargo-test", "loc-delta"]
+
+[body]
+# Free-text task instructions, concatenated AFTER role capability fragments
+text = """
+Replace shell-out with pure-Rust templating. …
+"""
+```
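As an illustration of how `test-count-min` might be enforced, the sketch below sums the `N passed` figures from the `test result:` summary lines that `cargo test` prints and compares the total with the minimum. The function names are hypothetical, and a real `quality::tests-green` predicate would also need to check the exit status of `cargo test` itself.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for a tests-green predicate.

# Reads `cargo test` output on stdin and prints the total passed count.
count_passed() {
  awk '/^test result:/ {
         # Summary lines look like: "test result: ok. 12 passed; 0 failed; ..."
         for (i = 1; i < NF; i++) if ($(i + 1) ~ /^passed/) total += $i
       }
       END { print total + 0 }'
}

# Succeeds if the passed count on stdin is at least MIN.
tests_meet_minimum() {
  local min=$1 passed
  passed=$(count_passed)
  if [ "$passed" -lt "$min" ]; then
    echo "tests-green: $passed passed, task requires at least $min" >&2
    return 1
  fi
  return 0
}
```

Summing across summary lines matters because one `cargo test -p <crate>` run prints a separate line for unit tests, each integration test binary, and doc tests.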
+
+---
+
+## Runtime execution contract
+
+`kei-agent-runtime` crate provides:
+
+```bash
+# Compose prompt from task spec
+kei-agent-runtime compose <task.toml>
+# → writes <task-dir>/prompt.md
+
+# Spawn agent with composed prompt + install gates + record ledger
+kei-agent-runtime spawn <task.toml>
+# → returns agent-id; background-task notification semantics
+
+# Run all capability verify predicates against agent's return
+kei-agent-runtime verify <task.toml> <worktree-path>
+# → exit 0 if all hold, non-zero with report of violations
+
+# One-shot helper: compose + spawn + verify
+kei-agent-runtime run <task.toml>
+```
+
+Execution flow:
+
+```
+1. orchestrator writes task.toml
+2. `kei-agent-runtime compose` → prompt.md
+3. `kei-agent-runtime spawn` →
+     a. kei-ledger fork <agent-id>
+     b. install PreToolUse gates parameterized by task.scope
+     c. Agent tool call with isolation=worktree + composed prompt
+4. [agent executes]
+5. `kei-agent-runtime verify` →
+     a. run each capability verify.sh from MAIN repo (not worktree)
+     b. collect all violations
+     c. exit 0 if empty, non-zero with report
+6. orchestrator decides: merge | reject + respawn | reject + rollback
+```
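Step 5 of the flow above can be sketched as a loop that runs every predicate, keeps going after a failure so that all violations are collected, and reports the first stderr line of each failing script. `run_verifiers` is a hypothetical name; the planned implementation is `verify.rs` in `kei-agent-runtime`.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the verify step: run each verify.sh, collect
# every violation, succeed only if none were found.
run_verifiers() {
  local script msg failed=0
  for script in "$@"; do
    # "2>&1 >/dev/null" captures stderr only and discards stdout.
    if ! msg=$(bash "$script" 2>&1 >/dev/null); then
      # The contract puts the human-readable description on the first line.
      printf 'VIOLATION %s: %s\n' "$script" "$(printf '%s\n' "$msg" | head -n 1)"
      failed=$((failed + 1))
    fi
  done
  [ "$failed" -eq 0 ]
}
```

The caller would export `AGENT_ID`, `TASK_TOML`, `WORKTREE_PATH`, and `MAIN_REPO` before invoking it, for example `run_verifiers _capabilities/*/*/verify.sh`, and pass the resulting report to the orchestrator's merge-or-reject decision.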
+
+---
+
+## Initial capability atom inventory (phase 1 builds these 10)
+
+| Name | Category | text / gate / verify | Core restriction |
+|---|---|---|---|
+| `policy::no-git-ops` | policy | ✓/✓/✓ | Block `git`, `gh repo`, `gh api /repos` |
+| `scope::files-whitelist` | scope | ✓/✓/✓ | PreToolUse:Edit\|Write denies paths outside whitelist; on-return git diff check |
+| `scope::files-denylist` | scope | ✓/✓/✓ | PreToolUse:Edit\|Write denies paths in denylist (overrides whitelist) |
+| `quality::constructor-pattern` | quality | ✓/—/✓ | On return: no file > 200 LOC, no fn > 30 LOC |
+| `quality::cargo-check-green` | quality | ✓/—/✓ | On return: `cargo check --workspace` from MAIN passes |
+| `quality::tests-green` | quality | ✓/—/✓ | On return: `cargo test -p <crate>` passes, count ≥ task min |
+| `safety::no-dep-bump` | safety | ✓/✓/✓ | PreToolUse:Edit on Cargo.toml denies unless task opts in; on-return lock-diff check |
+| `output::report-format` | output | ✓/—/✓ | On return: parse report, assert required fields present |
+| `tools::read-only` | tools | ✓/✓/— | PreToolUse denies Edit/Write entirely |
+| `tools::cargo-only-bash` | tools | ✓/✓/— | PreToolUse:Bash denies unless command matches allowlist pattern |
+
+---
+
+## Initial role inventory (phase 2 builds these 5)
+
+| Role | Capabilities | Tools |
+|---|---|---|
+| `read-only` | tools::read-only + output::report-format + output::severity-grade | Read / Glob / Grep / WebFetch |
+| `explorer` | read-only caps + tools::cargo-only-bash (for `cargo check`) | + Bash-cargo |
+| `edit-local` | policy::no-git-ops + scope::* + quality::* + safety::no-dep-bump + output::report-format | + Edit / Write / Bash-cargo |
+| `edit-shared` | edit-local caps + permission for specified SSoT patterns | Same + SSoT paths |
+| `git-ops` | Documented-only, NOT spawnable (orchestrator holds this) | All |
+
+---
+
+## Decision log — choose defaults before lock
+
+| # | Question | Proposed default | Rationale |
+|---|---|---|---|
+| 1 | 1 file per capability (bundled) OR 3 separate files (text/gate/verify)? | **3 separate** | Each artifact has different consumer; separate files = clean diff, independent edits |
+| 2 | Gate language: Bash or Rust? | **Bash for phase 1** | Integrates with existing hook system; Rust upgrade path exists via `kei-agent-runtime` compiling hook drivers |
+| 3 | Verify language: Bash or Rust? | **Bash for phase 1** | Same reason. Predicates are usually simple (run a cmd, check exit). Complex ones can shell-out to Rust helpers. |
+| 4 | Task spec format: TOML or YAML? | **TOML** | Consistent with Cargo.toml + `_primitives/MANIFEST.toml` + locked atom schema |
+| 5 | Capability ID separator: `::` or `/`? | **`::`** | Consistent with atom IDs. `policy::no-git-ops` reads Rust-native. |
+| 6 | Capability directory layout: nested `_capabilities/<category>/<slug>/` or flat `_capabilities/<cat>-<slug>/`? | **Nested** | Scales to 50+ capabilities, category browsability |
+| 7 | Text fragment max length | **200 words per capability** | Agent context budget; cap forces atomicity |
+| 8 | Verify runs from | **main-worktree by default, per-capability override** | Catches the E1-jsonschema-regression class of bug |
+
+---
+
+## Phase plan (post-lock, parallel)
+
+| Phase | What | Depends on | Agent | Estimate |
+|---|---|---|---|---|
+| 0 | This schema + lock | — | me | 0.5 day |
+| 1 | Capability atom library: 10 capabilities × up to 4 files each (text.md, gate.sh, verify.sh, capability.toml), at most 40 files; the inventory marks 6 gate/verify slots as not applicable | phase 0 | 1 code-implementer | 2 days |
+| 2 | Role matrix — 5 TOML + `docs/AGENT-ROLES.md` | phase 0 | 1 code-implementer | 0.5 day |
+| 3 | `kei-agent-runtime` crate — compose + spawn + verify + CLI | phase 0 | 1 code-implementer | 3-4 days |
+| 4 | Hook wiring — `~/.claude/hooks/agent-capability-gate.sh` meta-hook that reads task spec and invokes capability gates | phases 1 + 3 | 1 code-implementer | 1 day |
+| 5 | Migration — 5 custom agents (code-implementer / critic / architect / security-auditor / validator) move from hand-rolled prompts to role+task invocation | phases 1+2+3+4 | 1 code-implementer | 1 day |
+
+**Phases 1, 2, 3 can start immediately after lock** (different dirs, zero overlap).
+Phase 4 depends on 1+3.
+Phase 5 depends on everything.
+
+Total wall-time with parallel execution: ~5-7 days from lock (critical path: phase 3 → 4 → 5 = 5-6 days, plus slack).
+
+---
+
+## Integration with substrate v1
+
+This schema is **additive** to locked `SUBSTRATE-SCHEMA.md`. The two SSoTs sit side by side:
+
+- `SUBSTRATE-SCHEMA.md` — how code decomposes into atoms (locked 2026-04-22)
+- `AGENT-SUBSTRATE-SCHEMA.md` — how agent invocation decomposes into capabilities (this doc)
+
+Cross-ref: agent capability `quality::cargo-check-green` verifies that atoms compiled; atom agents produced via `kei-forge` can themselves be invoked through `kei-runtime` (atom substrate) OR composed into role definitions (agent substrate).
+
+Eventually (post-both-locks): **agents compose atoms, atoms compose agents**. Symmetric substrates.
+
+---
+
+## Lock declaration
+
+Once this document is approved by the user and `docs/AGENT-SCHEMA-LOCKED.md` is committed, the capability-triplet shape + role shape + task-spec shape + runtime contract are **immutable for 3 weeks** (a shorter lock than the atom substrate's, because the agent substrate is greenfield and revisions are expected).
+
+Breaking changes during lock require:
+1. Explicit revocation by user
+2. All parallel phase agents paused
+3. Lock marker amended with revocation reason
+4. `kei-ledger` row: bypass reason + revocation timestamp
+
+Non-breaking additions (new capability atoms beyond the initial 10, new roles, new parameterized fields on existing capabilities) are allowed during lock.
+
+---
+
+## Open questions for review
+
+Before we lock, call out things that might be wrong:
+
+1. **3-file triplet vs bundled .md** — is splitting text/gate/verify into 3 files the right granularity? Con: more files; pro: independent versioning + diff.
+2. **Gate language Bash** — ok for phase 1, or do you want Rust from day 1? Bash is quick but loses type safety. Rust phase-1 adds ~2 days to phase 3.
+3. **`main-worktree` verify default** — catches integration regressions, but means verify can't start until orchestrator has a clean main. Alt: `both` (run in worktree first for speed, then in main for correctness).
+4. **Task spec ephemerality** — `tasks/` gitignored. Should we persist them for archaeology? Ledger already tracks agent-id → spec-sha, so the spec is recoverable from `kei-sage` or `git log` if we do persist.
+5. **Capability atoms I didn't include in the initial 10** — should any be added? Candidates: `safety::no-mass-delete`, `output::ledger-row-required`, `quality::no-warnings`, `scope::no-rule-edits`.
+6. **Role `git-ops` documented but not spawnable** — do we document it in role TOML or in a rule file?
+
+Defaults resolved → lock → spawn phases 1-3 in parallel (phase 4 follows once phases 1 and 3 land).