From 372bbc83200947e924185d676d18764245ec9f85 Mon Sep 17 00:00:00 2001 From: Parfii-bot Date: Thu, 23 Apr 2026 01:39:23 +0800 Subject: [PATCH 1/2] =?UTF-8?q?docs(agent-substrate):=20v1=20schema=20DRAF?= =?UTF-8?q?T=20=E2=80=94=20capability=20triplet=20+=20role=20+=20task=20sp?= =?UTF-8?q?ec=20+=20runtime=20contract?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Sibling SSoT to SUBSTRATE-SCHEMA.md (atom substrate). This one decomposes agent invocations rather than code primitives. Core contribution — the capability TRIPLET, not just text: - text.md — what agent reads (prompt fragment) - gate.sh — PreToolUse hook (runtime enforcement) - verify.sh — on-return predicate run from main repo (not worktree) Motivation from substrate v1 orchestration audit: - 40% prompt boilerplate across 7 spawns (git-ban + constructor-pattern + report format etc. copy-pasted each time) - Self-reported green tests broke at integration (E1 jsonschema regression — agent claimed PASS from worktree but main workspace failed; caught only by integration test) - Scope violations (E1 touched invoke.rs when E3 was supposed to own it; surfaced only at merge) Triplet closes all three gaps: capabilities aren't promises agents make in prose, they're enforced by gate hooks pre-exec and verified by predicates on return from main branch clean state. Schema specifies: - Capability atom layout: _capabilities/// - capability.toml frontmatter shape - text.md / gate.sh / verify.sh contracts - Role = bundle of capabilities (5 roles: read-only, explorer, edit-local, edit-shared, git-ops) - task.toml shape (orchestrator-written per spawn; parameterizes roles) - kei-agent-runtime crate contract: compose + spawn + verify + run - Initial 10-capability inventory for phase 1 - 6-question decision log with defaults - 5-phase parallel build plan (phases 1-4 parallel, ~5-7 days wall time) Open questions flagged at bottom for review before AGENT-SCHEMA-LOCKED.md. 
Once locked: sibling SSoTs (atoms + agents) evolve symmetrically — agents compose atoms, atoms compose agents (ultimate goal). Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/AGENT-SUBSTRATE-SCHEMA.md | 424 +++++++++++++++++++++++++++++++++ 1 file changed, 424 insertions(+) create mode 100644 docs/AGENT-SUBSTRATE-SCHEMA.md diff --git a/docs/AGENT-SUBSTRATE-SCHEMA.md b/docs/AGENT-SUBSTRATE-SCHEMA.md new file mode 100644 index 0000000..c6be320 --- /dev/null +++ b/docs/AGENT-SUBSTRATE-SCHEMA.md @@ -0,0 +1,424 @@ +# KeiSeiKit Agent Substrate Schema v1 + +**STATUS:** Draft — under review. Once approved and `AGENT-SCHEMA-LOCKED.md` is committed, this document is **LOCKED** for 3 weeks of parallel phase work. + +**PURPOSE:** Sibling SSoT to `SUBSTRATE-SCHEMA.md`. That one decomposes code primitives (atoms). This one decomposes **agent invocations** (capabilities). + +**Motivation from substrate v1 orchestration pain:** across 7 agent spawns in audit+follow-up waves, the same friction recurred — 40% prompt boilerplate, self-reported green tests that broke at integration, scope violations surfacing only after merge. Fix: capabilities become **enforced triplets**, not suggestions in freetext prompts. + +--- + +## Core concept: capability atom = triplet + +An **agent capability** is not a reusable text block. It is a **three-artifact bundle** that gives every restriction meaning across three layers: + +| Artifact | What | Who consumes | +|---|---|---| +| `text.md` | The fragment the agent reads in its composed prompt | Agent (via LLM context) | +| `gate.sh` | PreToolUse hook fragment, runs before every tool call | Claude Code harness | +| `verify.sh` | Return-time predicate, runs from main repo on composed result | Orchestrator (kei-agent-runtime) | + +**The invariant:** if any of the three fails, the capability did not hold. 
Self-reported compliance is not trusted — verification runs from main branch after agent return, against a clean workspace, not from the agent's worktree where its own changes are hidden state. + +--- + +## File layout + +``` +_capabilities/ +├── policy/ +│ ├── no-git-ops/ +│ │ ├── capability.toml — metadata (name, category, tools, patterns) +│ │ ├── text.md — prompt fragment +│ │ ├── gate.sh — PreToolUse hook +│ │ └── verify.sh — on-return check +│ └── … +├── scope/ +│ ├── files-whitelist/ — parameterized by task.scope.files-whitelist +│ └── files-denylist/ +├── quality/ +│ ├── constructor-pattern/ +│ ├── cargo-check-green/ +│ └── tests-green/ +├── safety/ +│ └── no-dep-bump/ +├── output/ +│ ├── report-format/ +│ └── severity-grade/ +└── tools/ + ├── read-only/ + └── cargo-only-bash/ + +_roles/ +├── read-only.toml — bundle of capabilities + tool allowlist +├── explorer.toml +├── edit-local.toml — the code-implementer role +├── edit-shared.toml +└── git-ops.toml — documented but NOT spawnable (orchestrator only) + +_primitives/_rust/kei-agent-runtime/ — new crate, phase 3 +├── Cargo.toml +├── src/ +│ ├── lib.rs +│ ├── compose.rs — task.toml + role + capabilities → prompt.md +│ ├── spawn.rs — Agent tool call with composed prompt +│ ├── gate.rs — install hooks parameterized by task scope +│ └── verify.rs — run all capability verify.sh predicates +└── tests/ + +tasks/ — ephemeral, gitignored +└── (generated per spawn: /task.toml + prompt.md) + +docs/AGENT-SUBSTRATE-SCHEMA.md — this file +docs/AGENT-ROLES.md — human-readable role matrix (generated from _roles/*.toml) +docs/AGENT-SCHEMA-LOCKED.md — lock marker +``` + +--- + +## Capability atom — `capability.toml` shape + +```toml +[capability] +name = "policy::no-git-ops" # :: namespace +category = "policy" # policy | scope | quality | safety | output | tools +version = "1.0" +description = "RULE 0.13 — orchestrator owns git, agent writes files only" +rationale = "See ~/.claude/rules/orchestrator-branch-first.md" + 
+[restricts] +# What this capability forbids. Runtime gate enforces. +tool-patterns = [ # matched against tool_input.command + '^git( |$)', + '^gh (repo|api /repos)', +] +tools-denied = [] # e.g. ["Edit", "Write"] for read-only + +[parameterized] +# Is this capability instance-configurable per task? +accepts = [] # e.g. ["files-whitelist"] for scope/* caps + +[text] +path = "text.md" # relative to capability dir + +[gate] +path = "gate.sh" +event = "PreToolUse:Bash" # PreToolUse:Bash | PreToolUse:Edit|Write | PreToolUse:Agent +severity = "block" # block (exit 2) | warn (exit 0 + stderr) | advisory (log only) +bypass-env = "ORCHESTRATOR_META" # optional env var to disable + +[verify] +path = "verify.sh" +run-from = "main-worktree" # main-worktree | agent-worktree | both +when = "on-return" # on-return | per-tool-call +``` + +--- + +## Capability `text.md` conventions + +- Imperative, second-person, short. +- ≤ 200 words per fragment. +- No overlap — if two capabilities say the same thing, extract into a shared one. +- Fragment stands alone — composer concatenates multiple fragments with `\n\n---\n\n` separator; fragments must not reference each other. +- Lead with the rule ("You MUST NOT X"), follow with the why ("because Y"). + +Example (`_capabilities/policy/no-git-ops/text.md`): + +```markdown +## No git operations + +You MUST NOT invoke `git`, `gh repo`, `gh api /repos`, or any shell +command that modifies git state. Orchestrator handles all git operations +(commits, branches, pushes, rebases). + +If your task requires staging a change, describe it in the return +file-list — the orchestrator will commit on your behalf. + +Bypass exists for orchestrator-meta agents only; it is not available here. +``` + +--- + +## Capability `gate.sh` contract + +Shell script invoked by Claude Code as a PreToolUse hook. Receives JSON on stdin (`tool_name`, `tool_input`, plus task context injected by orchestrator). 
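+ +Exit codes: +- `0` — allow tool call to proceed +- `2` — block tool call (permissionDecision: deny) +- any other — treated as infrastructure error, warn to stderr, allow (fail-open per existing hook convention) + +Canonical shape (see `_capabilities/policy/no-git-ops/gate.sh` as reference impl): + +```bash +#!/usr/bin/env bash +set -eu +CMD=$(jq -r '.tool_input.command // empty' 2>/dev/null || true) +[ -z "$CMD" ] && exit 0 + +# Bypass for orchestrator-meta agents +[ "${ORCHESTRATOR_META:-0}" = "1" ] && exit 0 + +# Load restrictions from capability.toml at runtime +PATTERNS=$(capability_patterns policy::no-git-ops tool-patterns) +for pat in $PATTERNS; do + if echo "$CMD" | grep -qE "$pat"; then + cat <<EOF +… +``` + +`capability_patterns` is a small helper shipped with the runtime (reads the capability.toml and emits the list). + +--- + +## Capability `verify.sh` contract + +Shell script invoked by orchestrator after agent return. Receives env: +- `$AGENT_ID` — the agent's internal ID +- `$TASK_TOML` — path to the task spec the agent was given +- `$WORKTREE_PATH` — path to the agent's worktree +- `$MAIN_REPO` — path to the orchestrator's main repo +- plus any capability-specific env from `[parameterized]` + +Exit codes: +- `0` — predicate holds, capability verified +- non-zero — violation. stderr MUST contain a one-line human-readable description (≤ 200 chars). + +Example (`_capabilities/quality/cargo-check-green/verify.sh`): + +```bash +#!/usr/bin/env bash +set -eu +cd "$MAIN_REPO/_primitives/_rust" +if ! cargo check --workspace >/tmp/agent-$AGENT_ID-check.log 2>&1; then + echo "cargo check --workspace FAILED — agent claims regression not present in main workspace" >&2 + tail -5 /tmp/agent-$AGENT_ID-check.log >&2 + exit 1 +fi +exit 0 +``` + +This runs from `$MAIN_REPO` — not from the agent's worktree — because agent-local passing ≠ main-integration passing (the lesson from E1's jsonschema regression in audit wave). + +--- + +## Role — `_roles/.toml` shape + +```toml +[role] +name = "edit-local" +display-name = "code-implementer (local edit scope)" +description = "Write code + run cargo check/test + emit report. No git, no workspace touches." 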
+ +[capabilities] +# Ordered list — text.md fragments concatenated in this order +required = [ + "policy::no-git-ops", + "scope::files-whitelist", + "scope::files-denylist", + "quality::constructor-pattern", + "quality::cargo-check-green", + "quality::tests-green", + "safety::no-dep-bump", + "output::report-format", +] + +[tools] +# Tool allowlist — anything not in this list is denied +allowed = ["Read", "Write", "Edit", "Glob", "Grep", "Bash"] +# Bash further restricted by quality/tools atoms +bash-patterns-allowed = ['^cargo( |$)', '^mkdir( |$)', '^rm -rf /tmp/'] + +[escalation] +policy = "ask-via-return" # ask-via-return | orchestrator-notify | fail-fast +``` + +--- + +## Task spec — `task.toml` shape (orchestrator writes per spawn) + +```toml +[task] +role = "edit-local" +agent-id = "abc123…" # allocated by kei-ledger fork +parent-agent = null # or parent ID for nested + +[scope] +# Parameterizes scope::files-whitelist + scope::files-denylist +files-whitelist = [ + "_primitives/_rust/kei-forge/**", +] +files-denylist = [ + "_primitives/_rust/Cargo.toml", + "_primitives/_rust/Cargo.lock", + "scripts/**", + ".github/**", +] + +[verification] +# Parameterizes quality/* caps +cargo-check-crates = ["kei-forge"] +cargo-test-crates = ["kei-forge"] +test-count-min = 44 + +[output] +# Parameterizes output/report-format +report-fields-required = ["files-touched", "cargo-check", "cargo-test", "loc-delta"] + +[body] +# Free-text task instructions, concatenated AFTER role capability fragments +text = """ +Replace shell-out with pure-Rust templating. 
… +""" +``` + +--- + +## Runtime execution contract + +`kei-agent-runtime` crate provides: + +```bash +# Compose prompt from task spec +kei-agent-runtime compose +# → writes /prompt.md + +# Spawn agent with composed prompt + install gates + record ledger +kei-agent-runtime spawn +# → returns agent-id; background-task notification semantics + +# Run all capability verify predicates against agent's return +kei-agent-runtime verify +# → exit 0 if all hold, non-zero with report of violations + +# One-shot helper: compose + spawn + verify +kei-agent-runtime run +``` + +Execution flow: + +``` +1. orchestrator writes task.toml +2. `kei-agent-runtime compose` → prompt.md +3. `kei-agent-runtime spawn` → + a. kei-ledger fork + b. install PreToolUse gates parameterized by task.scope + c. Agent tool call with isolation=worktree + composed prompt +4. [agent executes] +5. `kei-agent-runtime verify` → + a. run each capability verify.sh from MAIN repo (not worktree) + b. collect all violations + c. exit 0 if empty, non-zero with report +6. 
orchestrator decides: merge | reject + respawn | reject + rollback +``` + +--- + +## Initial capability atom inventory (phase 1 builds these 10) + +| Name | Category | text / gate / verify | Core restriction | +|---|---|---|---| +| `policy::no-git-ops` | policy | ✓/✓/✓ | Block `git`, `gh repo`, `gh api /repos` | +| `scope::files-whitelist` | scope | ✓/✓/✓ | PreToolUse:Edit\|Write denies paths outside whitelist; on-return git diff check | +| `scope::files-denylist` | scope | ✓/✓/✓ | PreToolUse:Edit\|Write denies paths in denylist (overrides whitelist) | +| `quality::constructor-pattern` | quality | ✓/—/✓ | On return: no file > 200 LOC, no fn > 30 LOC | +| `quality::cargo-check-green` | quality | ✓/—/✓ | On return: `cargo check --workspace` from MAIN passes | +| `quality::tests-green` | quality | ✓/—/✓ | On return: `cargo test -p ` passes, count ≥ task min | +| `safety::no-dep-bump` | safety | ✓/✓/✓ | PreToolUse:Edit on Cargo.toml denies unless task opts in; on-return lock-diff check | +| `output::report-format` | output | ✓/—/✓ | On return: parse report, assert required fields present | +| `tools::read-only` | tools | ✓/✓/— | PreToolUse denies Edit/Write entirely | +| `tools::cargo-only-bash` | tools | ✓/✓/— | PreToolUse:Bash denies unless command matches allowlist pattern | + +--- + +## Initial role inventory (phase 2 builds these 5) + +| Role | Capabilities | Tools | +|---|---|---| +| `read-only` | tools::read-only + output::report-format + output::severity-grade | Read / Glob / Grep / WebFetch | +| `explorer` | read-only caps + tools::cargo-only-bash (for `cargo check`) | + Bash-cargo | +| `edit-local` | policy::no-git-ops + scope::* + quality::* + safety::no-dep-bump + output::report-format | + Edit / Write / Bash-cargo | +| `edit-shared` | edit-local caps + permission for specified SSoT patterns | Same + SSoT paths | +| `git-ops` | Documented-only, NOT spawnable (orchestrator holds this) | All | + +--- + +## Decision log — choose defaults before lock + +| # | 
Question | Proposed default | Rationale | +|---|---|---|---| +| 1 | 1 file per capability (bundled) OR 3 separate files (text/gate/verify)? | **3 separate** | Each artifact has different consumer; separate files = clean diff, independent edits | +| 2 | Gate language: Bash or Rust? | **Bash for phase 1** | Integrates with existing hook system; Rust upgrade path exists via `kei-agent-runtime` compiling hook drivers | +| 3 | Verify language: Bash or Rust? | **Bash for phase 1** | Same reason. Predicates are usually simple (run a cmd, check exit). Complex ones can shell-out to Rust helpers. | +| 4 | Task spec format: TOML or YAML? | **TOML** | Consistent with Cargo.toml + `_primitives/MANIFEST.toml` + locked atom schema | +| 5 | Capability ID separator: `::` or `/`? | **`::`** | Consistent with atom IDs. `policy::no-git-ops` reads Rust-native. | +| 6 | Gated file path: `_capabilities///` or flat `_capabilities/-/`? | **Nested** | Scales to 50+ capabilities, category browsability | +| 7 | Text fragment max length | **200 words per capability** | Agent context budget; cap forces atomicity | +| 8 | Verify runs from | **main-worktree by default, per-capability override** | Catches the E1-jsonschema-regression class of bug | + +--- + +## Phase plan (post-lock, parallel) + +| Phase | What | Depends on | Agent | Estimate | +|---|---|---|---|---| +| 0 | This schema + lock | — | me | 0.5 day | +| 1 | Capability atom library — 10 × (text.md, gate.sh, verify.sh, capability.toml) = 40 files | phase 0 | 1 code-implementer | 2 days | +| 2 | Role matrix — 5 TOML + `docs/AGENT-ROLES.md` | phase 0 | 1 code-implementer | 0.5 day | +| 3 | `kei-agent-runtime` crate — compose + spawn + verify + CLI | phase 0 | 1 code-implementer | 3-4 days | +| 4 | Hook wiring — `~/.claude/hooks/agent-capability-gate.sh` meta-hook that reads task spec and invokes capability gates | phases 1 + 3 | 1 code-implementer | 1 day | +| 5 | Migration — 5 custom agents (code-implementer / critic / architect / 
security-auditor / validator) move from hand-rolled prompts to role+task invocation | phases 1+2+3+4 | 1 code-implementer | 1 day | + +**Phases 1, 2, 3 can start immediately after lock** (different dirs, zero overlap). +Phase 4 depends on 1+3. +Phase 5 depends on everything. + +Total wall-time with parallel: ~5-7 days from lock. + +--- + +## Integration with substrate v1 + +This schema is **additive** to locked `SUBSTRATE-SCHEMA.md`. The two SSoTs sit side by side: + +- `SUBSTRATE-SCHEMA.md` — how code decomposes into atoms (locked 2026-04-22) +- `AGENT-SUBSTRATE-SCHEMA.md` — how agent invocation decomposes into capabilities (this doc) + +Cross-ref: agent capability `quality::cargo-check-green` verifies that atoms compiled; atom agents produced via `kei-forge` can themselves be invoked through `kei-runtime` (atom substrate) OR composed into role definitions (agent substrate). + +Eventually (post-both-locks): **agents compose atoms, atoms compose agents**. Symmetric substrates. + +--- + +## Lock declaration + +Once this document is approved by the user and `docs/AGENT-SCHEMA-LOCKED.md` is committed, the capability-triplet shape + role shape + task-spec shape + runtime contract are **immutable for 3 weeks** (shorter lock than atom substrate because agent substrate is greenfield, expected revisions). + +Breaking changes during lock require: +1. Explicit revocation by user +2. All parallel phase agents paused +3. Lock marker amended with revocation reason +4. `kei-ledger` row: bypass reason + revocation timestamp + +Non-breaking additions (new capability atoms beyond the initial 10, new roles, new parameterized fields on existing capabilities) are allowed during lock. + +--- + +## Open questions for review + +Before we lock, call out things that might be wrong: + +1. **3-file triplet vs bundled .md** — is splitting text/gate/verify into 3 files the right granularity? Con: more files; pro: independent versioning + diff. +2. 
**Gate language Bash** — ok for phase 1, or do you want Rust from day 1? Bash is quick but loses type safety. Rust phase-1 adds ~2 days to phase 3. +3. **`main-worktree` verify default** — catches integration regressions, but means verify can't start until orchestrator has a clean main. Alt: `both` (run in worktree first for speed, then in main for correctness). +4. **Task spec ephemerality** — `tasks/` gitignored. Should we persist them for archaeology? Ledger already tracks agent-id → spec-sha, so the spec is recoverable from `kei-sage` or `git log` if we do persist. +5. **Capability atoms I didn't include in the initial 10** — should any be added? Candidates: `safety::no-mass-delete`, `output::ledger-row-required`, `quality::no-warnings`, `scope::no-rule-edits`. +6. **Role `git-ops` documented but not spawnable** — do we document it in role TOML or in a rule file? + +Defaults resolved → lock → spawn phases 1-4 in parallel. From a25c282dca7ba2bcc08bba9ef43b077ce5820f96 Mon Sep 17 00:00:00 2001 From: Parfii-bot Date: Thu, 23 Apr 2026 02:05:21 +0800 Subject: [PATCH 2/2] =?UTF-8?q?feat(agent-substrate):=20LOCK=20schema=20?= =?UTF-8?q?=E2=80=94=208=20decisions=20resolved,=203-phase=20parallel=20wi?= =?UTF-8?q?ndow=20opens?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Resolved per user review 2026-04-23: - Layout: declarative (capability.toml + text.md) + Rust modules in kei-agent-runtime (not 3 bash files — user pushed for Rust) - Gate: Rust via kei-capability binary, shell hook = 3-line exec glue - Verify: Rust same binary, subcommand verify - Config: TOML (user asked discount between TOML vs YAML — explained type-safety + Cargo-native parsing wins) - Capability ID: :: separator (confirmed) - Nested path layout - 200 words/capability text cap - Verify: worktree short-circuit → simulated-merge (catches E1-jsonschema-class integration regressions before main merge) Phase 3 revised up from 3-4 days to 5-6 due to Rust 
gate/verify logic + simulated-merge executor. Offset by phase 4 dropping from 1 day to 0.5 (shell hooks now thin glue). 3 phases parallelizable immediately after this lock: - Phase 1: 20 declarative files (capability.toml + text.md × 10) - Phase 2: 5 role TOML + docs/AGENT-ROLES.md - Phase 3: kei-agent-runtime + kei-capability binaries + 14 Rust capability modules Phase 4 + 5 sequential after. Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/AGENT-SCHEMA-LOCKED.md | 41 ++++ docs/AGENT-SUBSTRATE-SCHEMA.md | 361 ++++++++++++++++++++++----------- 2 files changed, 283 insertions(+), 119 deletions(-) create mode 100644 docs/AGENT-SCHEMA-LOCKED.md diff --git a/docs/AGENT-SCHEMA-LOCKED.md b/docs/AGENT-SCHEMA-LOCKED.md new file mode 100644 index 0000000..71e2b6a --- /dev/null +++ b/docs/AGENT-SCHEMA-LOCKED.md @@ -0,0 +1,41 @@ +# Agent Substrate Schema — LOCKED + +**Locked on:** 2026-04-23 +**Locked from commit:** feat/agent-substrate-schema (decisions resolved) +**Schema SSoT:** [AGENT-SUBSTRATE-SCHEMA.md](./AGENT-SUBSTRATE-SCHEMA.md) +**Sibling SSoT:** [SUBSTRATE-SCHEMA.md](./SUBSTRATE-SCHEMA.md) (atoms, locked 2026-04-22) + +## Lock scope + +The capability triplet contract + role shape + task.toml shape + kei-agent-runtime Rust trait + CLI surface defined in `AGENT-SUBSTRATE-SCHEMA.md` are **immutable for 3 weeks** of parallel phase work (through ~2026-05-14). 
+ +## What "locked" means + +**Non-breaking during lock** (allowed): +- New capability atoms beyond the initial 10 +- New roles beyond the initial 5 +- New optional fields on `capability.toml` / `role.toml` / `task.toml` +- New verify `run-mode` values +- New gate `severity` levels + +**Breaking during lock** (requires revocation): +- Changing capability ID separator `::` +- Changing the Capability trait signature +- Switching capability definitions from Rust to another language +- Changing TOML → another config format +- Changing capability path layout `_capabilities///` +- Changing the 8-decision values in §Decision log + +## Phases under lock + +| Phase | Branch | Start | +|---|---|---| +| 1 | `feat/phase-1-capability-library` | on lock | +| 2 | `feat/phase-2-role-matrix` | on lock | +| 3 | `feat/phase-3-kei-agent-runtime` | on lock | +| 4 | `feat/phase-4-hook-wiring` | after 1+3 | +| 5 | `feat/phase-5-agent-migration` | after 1+2+3+4 | + +## Unlock + +Automatic unlock on **2026-05-14** OR on `AGENT-SCHEMA-UNLOCKED.md` commit by user (whichever comes first). Post-unlock, schema v2 can iterate based on what the 5 phases learned. diff --git a/docs/AGENT-SUBSTRATE-SCHEMA.md b/docs/AGENT-SUBSTRATE-SCHEMA.md index c6be320..7509122 100644 --- a/docs/AGENT-SUBSTRATE-SCHEMA.md +++ b/docs/AGENT-SUBSTRATE-SCHEMA.md @@ -1,6 +1,6 @@ # KeiSeiKit Agent Substrate Schema v1 -**STATUS:** Draft — under review. Once approved and `AGENT-SCHEMA-LOCKED.md` is committed, this document is **LOCKED** for 3 weeks of parallel phase work. +**STATUS:** Decisions resolved 2026-04-23 — see updated Decision log at bottom. LOCK active upon `AGENT-SCHEMA-LOCKED.md` commit. 3-week parallel phase window. **PURPOSE:** Sibling SSoT to `SUBSTRATE-SCHEMA.md`. That one decomposes code primitives (atoms). This one decomposes **agent invocations** (capabilities). @@ -10,64 +10,94 @@ ## Core concept: capability atom = triplet -An **agent capability** is not a reusable text block. 
It is a **three-artifact bundle** that gives every restriction meaning across three layers: +An **agent capability** is not a reusable text block. It is a **declarative bundle + Rust implementation** that gives every restriction meaning across three layers: -| Artifact | What | Who consumes | +| Artifact | Format | Who consumes | |---|---|---| -| `text.md` | The fragment the agent reads in its composed prompt | Agent (via LLM context) | -| `gate.sh` | PreToolUse hook fragment, runs before every tool call | Claude Code harness | -| `verify.sh` | Return-time predicate, runs from main repo on composed result | Orchestrator (kei-agent-runtime) | +| `capability.toml` | TOML declarative metadata (name, category, patterns, parameters) | kei-agent-runtime at compose + lint time | +| `text.md` | Markdown prompt fragment | Agent (via LLM context) | +| Rust module `gates/.rs` | Rust `impl Capability` trait | `kei-capability check` binary at PreToolUse | +| Rust module `verifies/.rs` | Rust `impl Capability` trait | `kei-capability verify` binary at on-return | -**The invariant:** if any of the three fails, the capability did not hold. Self-reported compliance is not trusted — verification runs from main branch after agent return, against a clean workspace, not from the agent's worktree where its own changes are hidden state. +The two Rust modules live in `_primitives/_rust/kei-agent-runtime/src/` — one compilation unit, one registry, `cargo test` on all gates/verifies at once. Shell hooks are 3-line glue that `exec`s the binary. + +**The invariant:** if any of the four artifacts is missing or fails, the capability did not hold. Self-reported compliance is not trusted — verification runs via **worktree short-circuit → simulated merge** pattern (see §Verify execution below) after agent return, catching integration regressions before merge to main. 
--- ## File layout ``` -_capabilities/ +_capabilities/ — DECLARATIVE artefacts (phase 1 writes these) ├── policy/ -│ ├── no-git-ops/ -│ │ ├── capability.toml — metadata (name, category, tools, patterns) -│ │ ├── text.md — prompt fragment -│ │ ├── gate.sh — PreToolUse hook -│ │ └── verify.sh — on-return check -│ └── … +│ └── no-git-ops/ +│ ├── capability.toml +│ └── text.md ├── scope/ -│ ├── files-whitelist/ — parameterized by task.scope.files-whitelist -│ └── files-denylist/ +│ ├── files-whitelist/{capability.toml, text.md} +│ └── files-denylist/{capability.toml, text.md} ├── quality/ -│ ├── constructor-pattern/ -│ ├── cargo-check-green/ -│ └── tests-green/ +│ ├── constructor-pattern/{capability.toml, text.md} +│ ├── cargo-check-green/{capability.toml, text.md} +│ └── tests-green/{capability.toml, text.md} ├── safety/ -│ └── no-dep-bump/ +│ └── no-dep-bump/{capability.toml, text.md} ├── output/ -│ ├── report-format/ -│ └── severity-grade/ +│ ├── report-format/{capability.toml, text.md} +│ └── severity-grade/{capability.toml, text.md} └── tools/ - ├── read-only/ - └── cargo-only-bash/ + ├── read-only/{capability.toml, text.md} + └── cargo-only-bash/{capability.toml, text.md} -_roles/ -├── read-only.toml — bundle of capabilities + tool allowlist +_roles/ — DECLARATIVE (phase 2 writes these) +├── read-only.toml ├── explorer.toml -├── edit-local.toml — the code-implementer role +├── edit-local.toml ├── edit-shared.toml -└── git-ops.toml — documented but NOT spawnable (orchestrator only) +└── git-ops.toml — documented; NOT spawnable (orchestrator-only) -_primitives/_rust/kei-agent-runtime/ — new crate, phase 3 +_primitives/_rust/kei-agent-runtime/ — BINARY (phase 3 writes this) ├── Cargo.toml ├── src/ -│ ├── lib.rs -│ ├── compose.rs — task.toml + role + capabilities → prompt.md -│ ├── spawn.rs — Agent tool call with composed prompt -│ ├── gate.rs — install hooks parameterized by task scope -│ └── verify.rs — run all capability verify.sh predicates +│ ├── lib.rs — 
exports Capability trait + registry +│ ├── main.rs — CLI: compose | spawn | verify | run +│ ├── compose.rs — task.toml + role + capabilities → prompt.md +│ ├── spawn.rs — Agent-tool invocation with composed prompt +│ ├── verify.rs — worktree short-circuit → simulated merge +│ ├── simulated_merge.rs — create temp branch + apply diff + run checks +│ ├── registry.rs — &str → Box dispatch +│ ├── gates/ — PreToolUse logic +│ │ ├── mod.rs +│ │ ├── policy_no_git_ops.rs +│ │ ├── scope_files_whitelist.rs +│ │ ├── scope_files_denylist.rs +│ │ ├── safety_no_dep_bump.rs +│ │ ├── tools_read_only.rs +│ │ └── tools_cargo_only_bash.rs — 6 gates +│ └── verifies/ — on-return logic +│ ├── mod.rs +│ ├── quality_constructor_pattern.rs +│ ├── quality_cargo_check_green.rs +│ ├── quality_tests_green.rs +│ ├── safety_no_dep_bump.rs +│ ├── scope_files_whitelist.rs +│ ├── scope_files_denylist.rs +│ ├── output_report_format.rs +│ └── output_severity_grade.rs — 8 verifies └── tests/ +_primitives/_rust/kei-capability/ — BINARY (phase 3) +├── Cargo.toml — depends on kei-agent-runtime +└── src/main.rs — clap CLI: + kei-capability check (stdin JSON, exit 0|2) + kei-capability verify (env-driven, exit 0 or fail) + +hooks/ — 3-line shell glue (phase 4) +├── agent-capability-check.sh — `exec kei-capability check "$CAP_NAME" "$@"` +└── agent-capability-verify.sh — called by orchestrator post-agent + tasks/ — ephemeral, gitignored -└── (generated per spawn: /task.toml + prompt.md) +└── /{task.toml, prompt.md} docs/AGENT-SUBSTRATE-SCHEMA.md — this file docs/AGENT-ROLES.md — human-readable role matrix (generated from _roles/*.toml) @@ -102,17 +132,23 @@ accepts = [] # e.g. 
["files-whitelist"] for scope/* cap path = "text.md" # relative to capability dir [gate] -path = "gate.sh" -event = "PreToolUse:Bash" # PreToolUse:Bash | PreToolUse:Edit|Write | PreToolUse:Agent -severity = "block" # block (exit 2) | warn (exit 0 + stderr) | advisory (log only) -bypass-env = "ORCHESTRATOR_META" # optional env var to disable +# Rust module path inside kei-agent-runtime — registry dispatches by capability.name +rust-module = "gates::policy_no_git_ops" # or empty if capability has no gate (verify-only) +event = "PreToolUse:Bash" # PreToolUse:Bash | PreToolUse:Edit|Write | PreToolUse:Agent +severity = "block" # block (exit 2) | warn (exit 0 + stderr) | advisory (log only) +bypass-env = "ORCHESTRATOR_META" # optional env var to disable [verify] -path = "verify.sh" -run-from = "main-worktree" # main-worktree | agent-worktree | both -when = "on-return" # on-return | per-tool-call +rust-module = "verifies::policy_no_git_ops" # or empty if gate-only +run-mode = "simulated-merge" # worktree | simulated-merge | both +when = "on-return" # on-return | per-tool-call ``` +**`run-mode` values:** +- `worktree` — run predicate inside the agent's worktree (fastest; what the agent saw) +- `simulated-merge` — orchestrator creates `test-merge/` branch off main, applies agent diff, runs predicate from there (catches integration regressions of the E1-jsonschema-class — see §Verify execution) +- `both` — worktree first (fail-fast), then simulated-merge (integration guarantee). Default for `quality::*` capabilities. + --- ## Capability `text.md` conventions @@ -140,71 +176,156 @@ Bypass exists for orchestrator-meta agents only; it is not available here. --- -## Capability `gate.sh` contract +## Capability trait contract (Rust) -Shell script invoked by Claude Code as a PreToolUse hook. Receives JSON on stdin (`tool_name`, `tool_input`, plus task context injected by orchestrator). +All gates and verifies implement the same trait, dispatched by string name. 
Registry in `kei-agent-runtime/src/registry.rs` maps `"policy::no-git-ops"` to `Box<dyn Capability>`. -Exit codes: -- `0` — allow tool call to proceed -- `2` — block tool call (permissionDecision: deny) -- any other — treated as infrastructure error, warn to stderr, allow (fail-open per existing hook convention) +```rust +// kei-agent-runtime/src/capability.rs -Canonical shape (see `_capabilities/policy/no-git-ops/gate.sh` as reference impl): +pub trait Capability: Send + Sync { + fn name(&self) -> &'static str; -```bash -#!/usr/bin/env bash -set -eu -CMD=$(jq -r '.tool_input.command // empty' 2>/dev/null || true) -[ -z "$CMD" ] && exit 0 + /// PreToolUse gate. Called by `kei-capability check ` binary. + /// Receives the hook JSON payload from Claude Code on stdin. + /// Returns Allow / Deny{reason} / NotApplicable. + fn check(&self, ctx: &GateContext) -> GateDecision { + GateDecision::NotApplicable // default: no gate, verify-only + } -# Bypass for orchestrator-meta agents -[ "${ORCHESTRATOR_META:-0}" = "1" ] && exit 0 + /// On-return verification predicate. Called by `kei-capability verify `. + /// Receives task context (agent-id, worktree path, main repo, task.toml values). + /// Returns Pass / Fail{reason}. 
+ fn verify(&self, ctx: &VerifyContext) -> VerifyResult { + VerifyResult::Pass // default: no verify, gate-only + } +} -# Load restrictions from capability.toml at runtime -PATTERNS=$(capability_patterns policy::no-git-ops tool-patterns) -for pat in $PATTERNS; do - if echo "$CMD" | grep -qE "$pat"; then - cat <<EOF … +pub struct GateContext<'a> { + pub tool_name: &'a str, + pub tool_input: &'a Value, + pub task: &'a TaskSpec, // parsed task.toml + pub env: &'a HashMap<String, String>, +} + +pub enum GateDecision { + Allow, + Deny { reason: String }, + NotApplicable, +} + +pub struct VerifyContext<'a> { + pub agent_id: &'a str, + pub task: &'a TaskSpec, + pub worktree_path: &'a Path, + pub main_repo: &'a Path, + pub run_mode: RunMode, // Worktree | SimulatedMerge | Both +} + +pub enum VerifyResult { + Pass, + Fail { reason: String, detail: Option<String> }, +} ``` -`capability_patterns` is a small helper shipped with the runtime (reads the capability.toml and emits the list). +Example implementation (`_primitives/_rust/kei-agent-runtime/src/gates/policy_no_git_ops.rs`): --- +```rust +use crate::capability::*; +use regex::Regex; +use once_cell::sync::Lazy; -## Capability `verify.sh` contract +pub struct NoGitOps; -Shell script invoked by orchestrator after agent return. Receives env: -- `$AGENT_ID` — the agent's internal ID -- `$TASK_TOML` — path to the task spec the agent was given -- `$WORKTREE_PATH` — path to the agent's worktree -- `$MAIN_REPO` — path to the orchestrator's main repo -- plus any capability-specific env from `[parameterized]` +static GIT_PATTERNS: Lazy<Vec<Regex>> = Lazy::new(|| vec![ + Regex::new(r"(?m)(?:^|[;&|]|\s)git(?:\s|$)").unwrap(), + Regex::new(r"(?m)(?:^|[;&|]|\s)gh\s+repo").unwrap(), + Regex::new(r"(?m)(?:^|[;&|]|\s)gh\s+api\s+/?repos").unwrap(), +]); -Exit codes: -- `0` — predicate holds, capability verified -- non-zero — violation. stderr MUST contain a one-line human-readable description (≤ 200 chars). 
+impl Capability for NoGitOps {
+    fn name(&self) -> &'static str { "policy::no-git-ops" }

-Example (`_capabilities/quality/cargo-check-green/verify.sh`):
-
-```bash
-#!/usr/bin/env bash
-set -eu
-cd "$MAIN_REPO/_primitives/_rust"
-if ! cargo check --workspace >/tmp/agent-$AGENT_ID-check.log 2>&1; then
-  echo "cargo check --workspace FAILED — agent claims regression not present in main workspace" >&2
-  tail -5 /tmp/agent-$AGENT_ID-check.log >&2
-  exit 1
-fi
-exit 0
+    fn check(&self, ctx: &GateContext) -> GateDecision {
+        if ctx.tool_name != "Bash" { return GateDecision::NotApplicable; }
+        if ctx.env.get("ORCHESTRATOR_META").map(|v| v == "1").unwrap_or(false) {
+            return GateDecision::Allow;
+        }
+        let cmd = ctx.tool_input.get("command").and_then(|v| v.as_str()).unwrap_or("");
+        for pat in GIT_PATTERNS.iter() {
+            if pat.is_match(cmd) {
+                return GateDecision::Deny {
+                    reason: format!("RULE 0.13 — git operation blocked (pattern {})", pat.as_str()),
+                };
+            }
+        }
+        GateDecision::Allow
+    }
+}
 ```

-This runs from `$MAIN_REPO` — not from the agent's worktree — because agent-local passing ≠ main-integration passing (the lesson from E1's jsonschema regression in audit wave).
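As a rough illustration of how the trait defaults and the name-keyed registry described above might fit together, here is a std-only sketch. The `Registry` type, the `VerifyOnly` capability, the reduced `GateContext`, and the word-level `git` match (standing in for the regex patterns in the patch) are all assumptions made for this example, not the crate's actual code.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum GateDecision {
    Allow,
    Deny { reason: String },
    NotApplicable,
}

// Reduced stand-in for the patch's GateContext: tool name and raw command only.
struct GateContext<'a> {
    tool_name: &'a str,
    command: &'a str,
}

trait Capability: Send + Sync {
    fn name(&self) -> &'static str;

    // Default: this capability has no gate.
    fn check(&self, _ctx: &GateContext) -> GateDecision {
        GateDecision::NotApplicable
    }
}

struct NoGitOps;

impl Capability for NoGitOps {
    fn name(&self) -> &'static str {
        "policy::no-git-ops"
    }

    fn check(&self, ctx: &GateContext) -> GateDecision {
        if ctx.tool_name != "Bash" {
            return GateDecision::NotApplicable;
        }
        // Simplification: deny when any shell word is exactly `git`.
        let hit = ctx
            .command
            .split(|c: char| c.is_whitespace() || matches!(c, ';' | '&' | '|'))
            .any(|word| word == "git");
        if hit {
            GateDecision::Deny { reason: "git operation blocked".to_string() }
        } else {
            GateDecision::Allow
        }
    }
}

// Hypothetical verify-only capability: inherits the default `check`.
struct VerifyOnly;

impl Capability for VerifyOnly {
    fn name(&self) -> &'static str {
        "quality::example-verify-only"
    }
}

// Hypothetical registry: capability ID -> boxed trait object.
struct Registry {
    by_name: HashMap<&'static str, Box<dyn Capability>>,
}

impl Registry {
    fn new() -> Self {
        Registry { by_name: HashMap::new() }
    }

    fn register(&mut self, cap: Box<dyn Capability>) {
        self.by_name.insert(cap.name(), cap);
    }

    // Returns None when the capability ID is unknown.
    fn check(&self, name: &str, ctx: &GateContext) -> Option<GateDecision> {
        self.by_name.get(name).map(|cap| cap.check(ctx))
    }
}

fn demo_registry() -> Registry {
    let mut reg = Registry::new();
    reg.register(Box::new(NoGitOps));
    reg.register(Box::new(VerifyOnly));
    reg
}
```

Keying the map by `Capability::name()` keeps the ID in one place, so a capability cannot be registered under a string that differs from the one it reports.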
+Example verify (`_primitives/_rust/kei-agent-runtime/src/verifies/quality_cargo_check_green.rs`):
+
+```rust
+use crate::capability::*;
+use std::process::Command;
+
+pub struct CargoCheckGreen;
+
+impl Capability for CargoCheckGreen {
+    fn name(&self) -> &'static str { "quality::cargo-check-green" }
+
+    fn verify(&self, ctx: &VerifyContext) -> VerifyResult {
+        let run_dir = match ctx.run_mode {
+            RunMode::Worktree => ctx.worktree_path,
+            RunMode::SimulatedMerge => &ctx.simulated_merge_path(),
+            RunMode::Both => unreachable!("runtime runs `both` as two sequential calls"),
+        };
+        let out = Command::new("cargo")
+            .arg("check")
+            .arg("--workspace")
+            .current_dir(run_dir.join("_primitives/_rust"))
+            .output();
+        match out {
+            Err(e) => VerifyResult::Fail {
+                reason: "cargo invocation failed".to_string(),
+                detail: Some(e.to_string()),
+            },
+            Ok(o) if !o.status.success() => {
+                let tail = String::from_utf8_lossy(&o.stderr).lines().rev().take(5).collect::<Vec<_>>();
+                VerifyResult::Fail {
+                    reason: "cargo check --workspace FAILED — agent-local green ≠ integration green".to_string(),
+                    detail: Some(tail.into_iter().rev().collect::<Vec<_>>().join("\n")),
+                }
+            }
+            Ok(_) => VerifyResult::Pass,
+        }
+    }
+}
+```
+
+## Verify execution — worktree → simulated merge
+
+The orchestrator runs verification in **two sequential passes** for `run-mode = "both"`:
+
+```
+Pass 1 — worktree (fail-fast)
+  cd
+  run capability.verify(RunMode::Worktree)
+  if Fail → reject immediately, don't bother with pass 2
+
+Pass 2 — simulated-merge (integration guarantee)
+  git checkout -b test-merge/ main # from MAIN repo, not worktree
+  git apply # apply agent's changes on clean main
+  cd
+  run capability.verify(RunMode::SimulatedMerge)
+  if Fail → reject with regression report
+  if Pass → safe to merge, orchestrator proceeds
+```
+
+Why both: agent's worktree passing doesn't mean merged-main passing.
E1's jsonschema regression was green in worktree (no real atoms there) but broke main integration (real atom schemas triggered the 0.17→0.18 breaking change). Simulated merge catches this class **before** it lands on main.
+
+Implementation lives in `kei-agent-runtime/src/simulated_merge.rs` — creates a temp worktree via `git worktree add`, applies diff, runs verify, cleans up.

 ---

@@ -349,18 +470,20 @@ Execution flow:

 ---

-## Decision log — choose defaults before lock
+## Decision log — resolved 2026-04-23

-| # | Question | Proposed default | Rationale |
+| # | Question | Decision | Rationale |
 |---|---|---|---|
-| 1 | 1 file per capability (bundled) OR 3 separate files (text/gate/verify)? | **3 separate** | Each artifact has different consumer; separate files = clean diff, independent edits |
-| 2 | Gate language: Bash or Rust? | **Bash for phase 1** | Integrates with existing hook system; Rust upgrade path exists via `kei-agent-runtime` compiling hook drivers |
-| 3 | Verify language: Bash or Rust? | **Bash for phase 1** | Same reason. Predicates are usually simple (run a cmd, check exit). Complex ones can shell-out to Rust helpers. |
-| 4 | Task spec format: TOML or YAML? | **TOML** | Consistent with Cargo.toml + `_primitives/MANIFEST.toml` + locked atom schema |
-| 5 | Capability ID separator: `::` or `/`? | **`::`** | Consistent with atom IDs. `policy::no-git-ops` reads Rust-native. |
-| 6 | Gated file path: `_capabilities///` or flat `_capabilities/-/`? | **Nested** | Scales to 50+ capabilities, category browsability |
-| 7 | Text fragment max length | **200 words per capability** | Agent context budget; cap forces atomicity |
-| 8 | Verify runs from | **main-worktree by default, per-capability override** | Catches the E1-jsonschema-regression class of bug |
+| 1 | Layout per capability | **Declarative bundle (`capability.toml` + `text.md`) + Rust modules in runtime crate** | Declarative artefacts live with capability; executable logic lives with its sibling capabilities in one Rust crate for shared tests + type safety |
+| 2 | Gate language | **Rust** via `kei-capability check ` binary; shell hook = 3-line `exec` glue | Type safety, unit tests, one compilation unit for all gates. Shell remains only as Claude-Code-hook-protocol adapter |
+| 3 | Verify language | **Rust** same binary, `kei-capability verify ` subcommand | Same reasoning. Cargo output parsing, LOC checks, diff analysis — all better in Rust |
+| 4 | Config format (capability.toml / role.toml / task.toml) | **TOML** | Consistent with Cargo ecosystem. YAML reserved only for locked atom `.md` frontmatter (immutable under atom substrate v1 lock) |
+| 5 | Capability ID separator | **`::`** | Consistent with atom IDs. Rust-native |
+| 6 | Capability path layout | **Nested `_capabilities///`** | Scales to 50+ capabilities, category browsability |
+| 7 | Text fragment max | **200 words per capability** | Agent context budget; forces atomicity |
+| 8 | Verify execution | **worktree short-circuit → simulated-merge** (default `both` for `quality::*`) | Catches E1-jsonschema-class integration regressions before main merge. See §Verify execution |
+
+**Locked values:** all 8 above. Breaking changes require explicit user revocation + all-phases sync.
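Decision 8's ordering (worktree pass first, simulated-merge pass only if the first one is green) can be sketched as below. This is an illustrative, std-only model: `run_verify` and the trimmed `VerifyResult` are hypothetical names for this example, and the closure stands in for whatever the runtime actually invokes per pass.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum RunMode {
    Worktree,
    SimulatedMerge,
    Both,
}

#[derive(Debug, PartialEq)]
enum VerifyResult {
    Pass,
    Fail { reason: String },
}

/// Runs `verify` once per pass implied by `mode`.
/// For `Both`, the worktree pass runs first and a failure there
/// skips the simulated-merge pass entirely (fail-fast).
/// Returns the final result plus the list of passes actually executed.
fn run_verify<F>(mode: RunMode, mut verify: F) -> (VerifyResult, Vec<RunMode>)
where
    F: FnMut(RunMode) -> VerifyResult,
{
    let passes: &[RunMode] = match mode {
        RunMode::Both => &[RunMode::Worktree, RunMode::SimulatedMerge],
        RunMode::Worktree => &[RunMode::Worktree],
        RunMode::SimulatedMerge => &[RunMode::SimulatedMerge],
    };
    let mut executed = Vec::new();
    for &pass in passes {
        executed.push(pass);
        let result = verify(pass);
        if result != VerifyResult::Pass {
            // Stop at the first failing pass and report it.
            return (result, executed);
        }
    }
    (VerifyResult::Pass, executed)
}
```

Because the capability only ever sees `Worktree` or `SimulatedMerge`, a `verify` implementation never has to handle `Both` itself, which matches the `unreachable!` arm in the patch's example.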

 ---

@@ -368,18 +491,18 @@ Execution flow:

 | Phase | What | Depends on | Agent | Estimate |
 |---|---|---|---|---|
-| 0 | This schema + lock | — | me | 0.5 day |
-| 1 | Capability atom library — 10 × (text.md, gate.sh, verify.sh, capability.toml) = 40 files | phase 0 | 1 code-implementer | 2 days |
-| 2 | Role matrix — 5 TOML + `docs/AGENT-ROLES.md` | phase 0 | 1 code-implementer | 0.5 day |
-| 3 | `kei-agent-runtime` crate — compose + spawn + verify + CLI | phase 0 | 1 code-implementer | 3-4 days |
-| 4 | Hook wiring — `~/.claude/hooks/agent-capability-gate.sh` meta-hook that reads task spec and invokes capability gates | phases 1 + 3 | 1 code-implementer | 1 day |
-| 5 | Migration — 5 custom agents (code-implementer / critic / architect / security-auditor / validator) move from hand-rolled prompts to role+task invocation | phases 1+2+3+4 | 1 code-implementer | 1 day |
+| 0 | This schema + lock marker | — | me | 0.5 day ✓ |
+| 1 | Capability library — 10 × (`capability.toml` + `text.md`) = **20 declarative files** | phase 0 | 1 code-implementer | 1-2 days |
+| 2 | Role matrix — 5 `_roles/*.toml` + auto-gen `docs/AGENT-ROLES.md` | phase 0 | 1 code-implementer | 0.5 day |
+| 3 | `kei-agent-runtime` + `kei-capability` binaries — compose/spawn/verify CLI + 6 gate modules + 8 verify modules + registry + simulated-merge executor | phase 0 | 1 code-implementer | 5-6 days |
+| 4 | Hook wiring — `agent-capability-check.sh` + `agent-capability-verify.sh` 3-line glue + settings.json registration | phases 1+3 | 1 code-implementer | 0.5 day |
+| 5 | Migration — 5 custom agents (code-implementer / critic / architect / security-auditor / validator) adopt role+task-spec invocation | phases 1+2+3+4 | 1 code-implementer | 1 day |

-**Phases 1, 2, 3 can start immediately after lock** (different dirs, zero overlap).
+**Phases 1, 2, 3 start in parallel immediately after lock** (different dirs, zero file overlap).
 Phase 4 depends on 1+3. Phase 5 depends on everything.
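The revised phase table implies a critical path through phases 0, 3, 4 and 5. A small sketch that derives the wall-time bounds from the table's "Depends on" and "Estimate" columns follows; `earliest_finish` and `plan_bounds` are hypothetical helper names, and durations are expressed in half-days so the arithmetic stays in integers.

```rust
/// Earliest finish time of each phase, in half-days.
/// `deps[i]` lists the phases that must finish before phase `i` can start.
/// Phases must be given in topological order (dependencies first).
fn earliest_finish(durations: &[u32], deps: &[&[usize]]) -> Vec<u32> {
    let mut finish: Vec<u32> = Vec::with_capacity(durations.len());
    for (i, &duration) in durations.iter().enumerate() {
        // A phase starts when its slowest dependency is done (or at 0).
        let start = deps[i].iter().map(|&p| finish[p]).max().unwrap_or(0);
        finish.push(start + duration);
    }
    finish
}

/// Lower and upper wall-time bounds (in half-days) for the revised plan.
fn plan_bounds() -> (u32, u32) {
    // Dependencies per phase 0..=5, taken from the "Depends on" column.
    let deps: [&[usize]; 6] = [&[], &[0], &[0], &[0], &[1, 3], &[1, 2, 3, 4]];
    // Estimates in half-days: 0.5 / 1-2 / 0.5 / 5-6 / 0.5 / 1 days.
    let lower = earliest_finish(&[1, 2, 1, 10, 1, 2], &deps);
    let upper = earliest_finish(&[1, 4, 1, 12, 1, 2], &deps);
    (lower[5], upper[5])
}
```

Under these estimates `plan_bounds()` yields 14 and 16 half-days, that is 7 to 8 days, with phase 3 dominating every path into phases 4 and 5.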

-Total wall-time with parallel: ~5-7 days from lock.
+Total wall-time with parallel phases 1+2+3: **~7-8 days from lock** (phase 3 is critical path).

 ---

@@ -410,15 +533,15 @@ Non-breaking additions (new capability atoms beyond the initial 10, new roles, n

 ---

-## Open questions for review
+## Deferred extension candidates (non-breaking post-lock)

-Before we lock, call out things that might be wrong:
+Capability atoms NOT in the initial 10 but good follow-up PRs (non-breaking additions during lock window):

-1. **3-file triplet vs bundled .md** — is splitting text/gate/verify into 3 files the right granularity? Con: more files; pro: independent versioning + diff.
-2. **Gate language Bash** — ok for phase 1, or do you want Rust from day 1? Bash is quick but loses type safety. Rust phase-1 adds ~2 days to phase 3.
-3. **`main-worktree` verify default** — catches integration regressions, but means verify can't start until orchestrator has a clean main. Alt: `both` (run in worktree first for speed, then in main for correctness).
-4. **Task spec ephemerality** — `tasks/` gitignored. Should we persist them for archaeology? Ledger already tracks agent-id → spec-sha, so the spec is recoverable from `kei-sage` or `git log` if we do persist.
-5. **Capability atoms I didn't include in the initial 10** — should any be added? Candidates: `safety::no-mass-delete`, `output::ledger-row-required`, `quality::no-warnings`, `scope::no-rule-edits`.
-6. **Role `git-ops` documented but not spawnable** — do we document it in role TOML or in a rule file?
+- `safety::no-mass-delete` — PreToolUse denies `rm -rf` on more than N files
+- `output::ledger-row-required` — verify agent emitted ledger row per RULE 0.12
+- `quality::no-warnings` — `cargo build --workspace` with `-D warnings`
+- `scope::no-rule-edits` — denies edits to `~/.claude/rules/*.md` unless orchestrator-meta

-Defaults resolved → lock → spawn phases 1-4 in parallel.
+Role `git-ops` — documented in `docs/AGENT-ROLES.md` only; `_roles/git-ops.toml` has `spawnable = false` field. Orchestrator code refuses to spawn it. Exists for documentation of "who can do git" boundary.
+
+Task spec persistence: task.toml files are ephemeral (gitignored under `tasks/`). Ledger row includes spec-SHA so historical specs are recoverable from `kei-sage` archive if someone wants cold-storage replay.