feat(skills): /test-matrix 5-phase pipeline

Adds a hub-and-spoke testing-matrix skill that complements /test-gen:

- SKILL.md: index
- phase-1-intake: language / coverage / critical paths / CI
- phase-2-matrix: test-type × language multi-select
- phase-3-scaffold: config + corpus + fixtures per cell
- phase-4-ci-wire: per-type failure policy + artifacts
- phase-5-triage: crash/regression runbook

Cross-refs _blocks/test-fuzz.md, test-property.md, test-load.md, test-e2e.md.
Adds a "complements" note to skills/test-gen/SKILL.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 20:46:02 +08:00 · 2026-04-21 20:46:02 +08:00 · 56ddccfddb
commit 56ddccfddb
parent 8b6ee37134
7 changed files with 513 additions and 0 deletions
--- a/skills/test-gen/SKILL.md
+++ b/skills/test-gen/SKILL.md
@@ -7,6 +7,12 @@ arguments:
    required: true
 ---

+> **Complements `/test-matrix`.** `/test-gen` owns per-function unit tests
+> (happy / edge / error). `/test-matrix` owns project-wide testing strategy
+> (fuzz / property / load / e2e / mutation) and CI wiring. Use `/test-gen`
+> for a specific function, `/test-matrix` at project kickoff or when
+> coverage gaps span paradigms. See `skills/test-matrix/SKILL.md`.
+
 # Test Generation Workflow

 ## Step 1: Analyze Target
--- a/skills/test-matrix/SKILL.md
+++ b/skills/test-matrix/SKILL.md
@@ -0,0 +1,104 @@
+---
+name: test-matrix
+description: Use when a project needs testing BEYOND unit tests — fuzzing, property-based, load, E2E, or mutation. Five-phase hub-and-spoke pipeline composes the right mix per language × critical path × CI target, scaffolds configs + corpus + fixtures, wires CI jobs, and defines the crash/regression triage workflow. Pure-click: every decision except intake is an AskUserQuestion.
+argument-hint: <free-text description of what needs testing and why>
+---
+
+# /test-matrix — Testing beyond unit tests (index)
+
+You are designing a **testing matrix** for a project that already has (or
+should have) unit-test coverage via `/test-gen`. This skill owns the
+orthogonal axes:
+
+- **Fuzzing** — input-space exploration at boundaries (parsers, deserializers, crypto)
+- **Property-based** — invariants verified over generated inputs (pure functions, data structures)
+- **Load** — SLO assertion under traffic (`k6`/`vegeta`/`oha`, baseline→profile→fix)
+- **E2E** — browser-driven critical journeys (Playwright, page objects, trace viewer)
+- **Mutation** — test-suite quality verification (mutmut / cargo-mutants / StrykerJS)
+
+**Not duplicated here:** happy-path / edge / error unit tests (`/test-gen`
+owns those). This skill links rather than re-implements.
+
+This `SKILL.md` is the INDEX. Each phase lives in its own file, executed in
+order. Never skip, never re-order.
+
+---
+
+## Pipeline overview (5 phases + final report)
+
+| Phase | File | Purpose | AskUserQuestion count |
+|---|---|---|---:|
+| 1 | [phase-1-intake.md](phase-1-intake.md) | Language(s), coverage baseline, critical paths, CI target | 1× (multi-part) |
+| 2 | [phase-2-matrix.md](phase-2-matrix.md) | Select test types × languages matrix | 1× multi-select |
+| 3 | [phase-3-scaffold.md](phase-3-scaffold.md) | Generate config + corpus + fixtures per selected cell | 1× per cell |
+| 4 | [phase-4-ci-wire.md](phase-4-ci-wire.md) | CI job per test type; artifacts; failure policy | 1× multi-select |
+| 5 | [phase-5-triage.md](phase-5-triage.md) | Crash + regression triage workflow | 1× |
+
+Minimum AskUserQuestion count across a full session: **5** (one per phase).
+Higher when Phase 3 expands per selected cell. This is the pure-click
+contract.
+
+---
+
+## Variables the pipeline produces
+
+| Name | Set in | Meaning |
+|---|---|---|
+| `LANGS` | Phase 1 | Languages in scope (Rust / Python / JS-TS / Go / Swift / Flutter — multi) |
+| `COVERAGE` | Phase 1 | Baseline unit-test coverage % (or "unknown") |
+| `CRITICAL` | Phase 1 | Critical paths: auth / payment / data-integrity / perf / untrusted-input |
+| `CI` | Phase 1 | github-actions / forgejo-actions / self-hosted / none |
+| `MATRIX` | Phase 2 | Set of (test-type × language) cells to scaffold |
+| `SCAFFOLDED` | Phase 3 | Files written per cell (paths + corpus seeds) |
+| `CI_JOBS` | Phase 4 | CI workflow entries added per cell |
+| `TRIAGE_DOC` | Phase 5 | Path to `docs/testing/triage.md` (or project-local equivalent) |
+
+---
+
+## Final report (emit after Phase 5)
+
+```
+=== TEST-MATRIX REPORT ===
+Languages:        <LANGS>
+Coverage (unit):  <COVERAGE>
+Critical paths:   <CRITICAL>
+Matrix cells:     <count> — <list (type × lang)>
+Files written:    <count> (configs + corpus + fixtures)
+CI jobs added:    <count> (<per-type failure policy>)
+Triage doc:       <TRIAGE_DOC>
+Next action:      Run <cmd> locally to verify the scaffold, then commit.
+```
+
+---
+
+## Rules (apply throughout)
+
+- **Pure-click contract.** Only the Phase 1 intake paragraph is free text.
+  Everything else is `AskUserQuestion`. Report the click count in the final report.
+- **NO DOWNGRADE (RULE -1).** If a language × type cell has no good tool,
+  return 2-3 constructive paths, never "not supported".
+- **NO HALLUCINATION (RULE 0.4).** Every tool / library cited must exist
+  and be current. When in doubt, mark `[UNVERIFIED — verify release page]`
+  and surface in the report.
+- **Plan Mode First (RULE 0.5).** This skill IS the plan; no writes before
+  the corresponding phase's confirm click.
+- **Constructor Pattern (RULE ZERO).** Block files (`_blocks/test-*.md`)
+  stay ≤ 60 LOC. This SKILL.md ≤ 200 LOC; phase files ≤ 150 LOC each.
+- **Surgical Changes.** Writes only to:
+  - `<repo>/tests/`, `<repo>/fuzz/`, `<repo>/e2e/`, `<repo>/load/`
+  - `<repo>/.github/workflows/` or `<repo>/.forgejo/workflows/`
+  - `<repo>/docs/testing/triage.md`
+  - No writes to `_blocks/` here (that's `compose-solution`'s Phase 6).
+- **No duplication with `/test-gen`.** If the user really wants unit-test
+  generation, Phase 1 detects it and hands off immediately.
+
+---
+
+## References
+
+- [phase-1-intake.md](phase-1-intake.md) · [phase-2-matrix.md](phase-2-matrix.md) · [phase-3-scaffold.md](phase-3-scaffold.md) · [phase-4-ci-wire.md](phase-4-ci-wire.md) · [phase-5-triage.md](phase-5-triage.md)
+- `skills/test-gen/SKILL.md` — unit-test generation (happy / edge / error).
+  Phase 1 hands off there if intake reveals unit-test gap, not matrix gap.
+- `_blocks/test-fuzz.md` · `_blocks/test-property.md` · `_blocks/test-load.md` · `_blocks/test-e2e.md` — per-paradigm reference blocks, composable into manifests.
+- `_blocks/rule-test-first.md` — TDD / tests-with-code discipline (inherited).
+- `skills/compose-solution/SKILL.md` — if you need a NEW block (e.g. mutation-specific), hand off there (Phase 6 block-augment).
--- a/skills/test-matrix/phase-1-intake.md
+++ b/skills/test-matrix/phase-1-intake.md
@@ -0,0 +1,92 @@
+# Phase 1 — Intake (language, coverage, critical paths, CI)
+
+One free-text paragraph + one AskUserQuestion multi-part batch.
+
+## 1a — Ask for the testing-gap description
+
+Emit a regular message (NOT AskUserQuestion):
+
+> Describe in one paragraph: what are you testing (project name / stack),
+> what gap is `/test-gen` not solving (fuzz? load? E2E? mutation? all?),
+> and what failure mode would be worst (prod crash? data loss? latency
+> regression? auth bypass?). Reply in one message.
+
+Store verbatim as `INTAKE`.
+
+If `INTAKE` mentions ONLY "unit tests" / "missing tests for function X"
+(unit-level gap, not matrix gap), emit:
+
+```
+DETECTION: this is a /test-gen task, not /test-matrix.
+Handing off to `skills/test-gen/SKILL.md`. Re-run /test-matrix later
+when fuzz / property / load / E2E / mutation coverage is needed.
+```
+
+…and STOP. Do not proceed.
+
+## 1b — Multi-part intake click (one AskUserQuestion call)
+
+```json
+{
+  "questions": [
+    {
+      "question": "Language(s) in scope?",
+      "header": "Languages",
+      "multiSelect": true,
+      "options": [
+        {"label": "Rust",        "description": "cargo-fuzz, proptest, cargo-mutants, oha"},
+        {"label": "Python",      "description": "hypothesis, atheris, mutmut, schemathesis"},
+        {"label": "JavaScript/TypeScript", "description": "fast-check, StrykerJS, Playwright"},
+        {"label": "Go",          "description": "built-in fuzz (go test -fuzz), gopter, vegeta"},
+        {"label": "Swift",       "description": "SwiftCheck, XCUITest — limited fuzz tooling"},
+        {"label": "Flutter/Dart", "description": "glados property, flutter integration_test"}
+      ]
+    },
+    {
+      "question": "Baseline unit-test coverage?",
+      "header": "Coverage",
+      "multiSelect": false,
+      "options": [
+        {"label": "High (≥ 80%)",        "description": "Matrix tests layer on top of solid unit base"},
+        {"label": "Medium (40-80%)",     "description": "Run /test-gen in parallel, don't skip unit gaps"},
+        {"label": "Low (< 40%)",         "description": "Strongly recommend /test-gen FIRST — fuzz+load on buggy code wastes CI"},
+        {"label": "Unknown — need to measure", "description": "Phase 3 will add a coverage job before scaffolding"}
+      ]
+    },
+    {
+      "question": "Critical paths (multi-select)?",
+      "header": "Critical",
+      "multiSelect": true,
+      "options": [
+        {"label": "Auth / session / crypto",         "description": "Fuzz + property mandatory on token parsers + signature verify"},
+        {"label": "Payment / money-in-motion",        "description": "E2E + property (invariants: no negative balance, idempotency) mandatory"},
+        {"label": "Data integrity (DB / serialization)", "description": "Property-based round-trips + migration E2E"},
+        {"label": "Performance-sensitive (< 100ms SLO)", "description": "Load tests with k6/oha mandatory; set SLO thresholds in CI"},
+        {"label": "Untrusted-input parsing",          "description": "Fuzz mandatory (cargo-fuzz / atheris / jsfuzz)"},
+        {"label": "User-facing UI flows",             "description": "E2E with Playwright on 5-15 critical journeys"}
+      ]
+    },
+    {
+      "question": "CI target?",
+      "header": "CI",
+      "multiSelect": false,
+      "options": [
+        {"label": "GitHub Actions",       "description": "workflow file under .github/workflows/"},
+        {"label": "Forgejo Actions",      "description": "workflow file under .forgejo/workflows/ (kit default — RULE 0.1 compatible)"},
+        {"label": "Self-hosted / custom", "description": "Emit portable YAML + shell scripts; wire manually"},
+        {"label": "None — local only",    "description": "Generate Makefile / justfile targets, no CI"}
+      ]
+    }
+  ]
+}
+```
+
+Store as `LANGS`, `COVERAGE`, `CRITICAL`, `CI`.
+
+## Verify-criterion
+
+- `INTAKE` is non-empty.
+- `LANGS` has ≥ 1 entry.
+- `CRITICAL` has ≥ 1 entry (zero-critical-path tasks are unit-test-only — redirect to /test-gen).
+- `CI` is exactly one value.
+- On failure, re-ask the failing input only. Never fall through.
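
The Phase 1 verify-criterion can be read as a small validation pass. The following is an illustrative Python sketch and is not one of the files in this commit; the `validate_intake` helper and the dict shape holding the answers are hypothetical.

```python
from typing import Any


def validate_intake(answers: dict[str, Any]) -> list[str]:
    """Return the names of inputs that fail the Phase 1 checks (empty list = pass)."""
    failing: list[str] = []
    # INTAKE must be a non-empty paragraph.
    if not str(answers.get("INTAKE", "")).strip():
        failing.append("INTAKE")
    # LANGS and CRITICAL are multi-select answers: at least one entry each.
    for key in ("LANGS", "CRITICAL"):
        if len(answers.get(key) or []) < 1:
            failing.append(key)
    # CI is single-select: exactly one non-empty string.
    ci = answers.get("CI")
    if not isinstance(ci, str) or not ci.strip():
        failing.append("CI")
    return failing


# Hypothetical answers: everything is filled in except the critical paths.
example = {
    "INTAKE": "Rust parser; worst case is a prod crash",
    "LANGS": ["Rust"],
    "CRITICAL": [],
    "CI": "github-actions",
}
print(validate_intake(example))  # ['CRITICAL']
```

Returning the failing names, rather than a boolean, matches the "re-ask the failing input only" rule. Whether an empty `CRITICAL` is re-asked or redirected to `/test-gen` is left to the caller.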
--- a/skills/test-matrix/phase-2-matrix.md
+++ b/skills/test-matrix/phase-2-matrix.md
@@ -0,0 +1,80 @@
+# Phase 2 — Select the test-type × language matrix
+
+Goal: turn `CRITICAL` + `LANGS` into the minimum set of `(test-type, language)`
+cells to scaffold. Fewer cells, done well, beats many cells half-wired.
+
+## 2a — Preview auto-recommendation
+
+Apply these rules and emit a preview table in chat (markdown):
+
+| Critical path | Recommended test types |
+|---|---|
+| Auth / crypto | fuzz + property |
+| Payment | property + e2e + mutation |
+| Data integrity | property + e2e |
+| Performance SLO | load |
+| Untrusted parsing | fuzz + property |
+| User-facing UI | e2e |
+
+Cross-product with `LANGS` → tentative `MATRIX_RECO`. Example output in chat:
+
+```
+Recommended cells (from CRITICAL × LANGS):
+  [1] fuzz × Rust       — rationale: untrusted-parsing + Rust → cargo-fuzz
+  [2] property × Rust   — rationale: data-integrity + Rust → proptest
+  [3] e2e × TS          — rationale: user-facing UI → Playwright
+  [4] load × Rust       — rationale: <100ms SLO → oha + k6
+  [5] mutation × Rust   — rationale: payment → cargo-mutants for suite quality
+```
+
+Number each cell for the multi-select.
+
+## 2b — Confirm / edit matrix (AskUserQuestion multi-select)
+
+```json
+{
+  "questions": [
+    {
+      "question": "Which cells to scaffold this session?",
+      "header": "Matrix",
+      "multiSelect": true,
+      "options": [
+        {"label": "[1] fuzz × <lang>",      "description": "Generate fuzz target + seed corpus + CI nightly job"},
+        {"label": "[2] property × <lang>",  "description": "Add property-test dependency + sample invariant test + regression cache"},
+        {"label": "[3] e2e × <lang>",       "description": "Scaffold Playwright project + 1 page-object example + trace viewer"},
+        {"label": "[4] load × <lang>",      "description": "k6/oha script + SLO thresholds + profile-loop runbook"},
+        {"label": "[5] mutation × <lang>",  "description": "mutmut/cargo-mutants/StrykerJS config + baseline mutation score"},
+        {"label": "Add a custom cell",       "description": "Free-text — e.g. contract tests, chaos tests, visual regression"},
+        {"label": "Skip a reco",             "description": "Drop one of the recommended cells — free-text reason"}
+      ]
+    }
+  ]
+}
+```
+
+Options are GENERATED dynamically — one per `MATRIX_RECO` cell PLUS the two
+catch-alls (`Add a custom cell`, `Skip a reco`). Replace `<lang>` with the cell's actual language.
+
+On `Add a custom cell` → single free-text line → regenerate preview →
+re-ask. On `Skip a reco` → free-text reason (logged in final report) →
+regenerate → re-ask.
+
+## 2c — Budget check (soft cap)
+
+If the final `MATRIX` has > 6 cells, emit a WARNING message (NOT
+AskUserQuestion):
+
+> WARNING: <N> cells selected. Scaffolding + CI wiring for each is ~30 min
+> of human review per cell. Consider splitting into two sessions (critical
+> cells now, rest next week). Continue? Reply "yes" or re-run Phase 2.
+
+Store the final `MATRIX` as a list of `{type, lang, rationale}` objects.
+
+## Verify-criterion
+
+- `MATRIX` has ≥ 1 cell. Zero cells means nothing to do → stop with a
+  message pointing at `/test-gen`.
+- Every cell's `type` ∈ {fuzz, property, e2e, load, mutation, custom}.
+- Every cell's `lang` ∈ `LANGS` (no phantom language).
+- User explicitly confirmed the final matrix (not just auto-reco) — the
+  multi-select click counts as the confirmation.
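
The 2a rule table and the 2c soft cap can be sketched as a lookup plus a cross-product. This Python sketch is illustrative only and is not part of the commit; the `recommend` and `over_budget` helpers are hypothetical, the short keys follow the `CRITICAL` values listed in `SKILL.md`, and the `"ui"` key is an assumed shorthand for the user-facing UI row.

```python
# Rule table copied from section 2a; the "ui" key is an assumed shorthand.
RECOMMENDED = {
    "auth": ["fuzz", "property"],
    "payment": ["property", "e2e", "mutation"],
    "data-integrity": ["property", "e2e"],
    "perf": ["load"],
    "untrusted-input": ["fuzz", "property"],
    "ui": ["e2e"],
}
SOFT_CAP = 6  # section 2c warns when the matrix has more cells than this


def recommend(critical: list[str], langs: list[str]) -> list[dict[str, str]]:
    """Cross the recommended test types with LANGS, keeping first-seen order."""
    cells: list[dict[str, str]] = []
    seen: set[tuple[str, str]] = set()
    for path in critical:
        for test_type in RECOMMENDED.get(path, []):
            for lang in langs:
                if (test_type, lang) not in seen:
                    seen.add((test_type, lang))
                    cells.append({"type": test_type, "lang": lang, "rationale": path})
    return cells


def over_budget(matrix: list[dict[str, str]]) -> bool:
    """True when the 2c warning should be emitted."""
    return len(matrix) > SOFT_CAP


cells = recommend(["untrusted-input", "payment"], ["Rust"])
print([c["type"] for c in cells])  # ['fuzz', 'property', 'e2e', 'mutation']
```

This is a plain cross-product, so it pairs every test type with every language. The example output in 2a pairs `e2e` only with TS, so a real implementation would likely filter cells by which language actually hosts each paradigm.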
--- a/skills/test-matrix/phase-3-scaffold.md
+++ b/skills/test-matrix/phase-3-scaffold.md
@@ -0,0 +1,74 @@
+# Phase 3 — Scaffold config + corpus + fixtures per cell
+
+For each cell in `MATRIX`, generate the minimum-viable scaffold: one
+dependency declaration, one example test, one fixture / seed corpus, one
+local-run command. No over-scaffolding — just the "it runs" skeleton.
+
+## 3a — Per-cell confirmation (AskUserQuestion, loop over cells)
+
+For each cell, emit ONE AskUserQuestion:
+
+```json
+{
+  "questions": [
+    {
+      "question": "Scaffold plan for [<type> × <lang>] — proceed?",
+      "header": "<type>/<lang>",
+      "multiSelect": false,
+      "options": [
+        {"label": "Proceed with default scaffold",     "description": "Apply the default files listed below"},
+        {"label": "Minimal only (dep + 1 test)",        "description": "Skip CI + corpus; just prove the toolchain runs"},
+        {"label": "Edit one file",                      "description": "Reply with one free-text path — that file only gets custom content"},
+        {"label": "Skip this cell",                     "description": "Drop from MATRIX; next cell"}
+      ]
+    }
+  ]
+}
+```
+
+Preview the default scaffold BEFORE asking, so the user sees what "proceed"
+means. Example for `[fuzz × Rust]`:
+
+```
+Default scaffold for [fuzz × Rust]:
+  + fuzz/Cargo.toml           — cargo-fuzz manifest
+  + fuzz/fuzz_targets/parse.rs — example fuzz_target!(|data: &[u8]| { ... })
+  + fuzz/corpus/parse/seed_01  — one hand-picked valid input
+  + fuzz/README.md             — local-run commands
+Cite: _blocks/test-fuzz.md (corpus mgmt + triage + CI rules)
+```
+
+## 3b — Per-type default scaffolds
+
+| Cell | Files |
+|---|---|
+| **fuzz × Rust** | `fuzz/Cargo.toml` (cargo-fuzz), `fuzz/fuzz_targets/<target>.rs`, `fuzz/corpus/<target>/seed_01` |
+| **fuzz × Python** | `tests/fuzz/test_fuzz_<target>.py` (atheris OR hypothesis in fuzz mode), `tests/fuzz/corpus/` |
+| **fuzz × JS/TS** | `test/fuzz/<target>.fuzz.ts` (fast-check with `numRuns: 10_000`) |
+| **property × Rust** | `Cargo.toml` adds `proptest = "1"` under `[dev-dependencies]`, `tests/property_<name>.rs`, `.proptest-regressions` gitkeep |
+| **property × Python** | `tests/property/test_<name>.py` with `@given`, `.hypothesis/` gitignored except `examples/` |
+| **property × JS/TS** | `test/property/<name>.spec.ts` with `fc.assert(fc.property(...))` |
+| **load × any** | `load/k6/baseline.js` with SLO thresholds; `load/README.md` with baseline→profile→fix loop |
+| **e2e × any** | `e2e/playwright.config.ts`, `e2e/pages/login.page.ts`, `e2e/tests/login.spec.ts`, `e2e/README.md` |
+| **mutation × Rust** | `.cargo/mutants.toml`, first run command in `tests/mutation/README.md` |
+| **mutation × Python** | `mutmut` config in `setup.cfg` / `pyproject.toml`, runbook in `tests/mutation/README.md` |
+| **mutation × JS/TS** | `stryker.conf.mjs` with sane `timeoutMS`, `mutate` glob narrowed to critical paths |
+
+## 3c — Cite the block
+
+Every scaffold file's header comment references the relevant `_blocks/`
+file so the human reviewer can find the discipline rules:
+
+```rust
+// See _blocks/test-fuzz.md for corpus management + crash-triage rules.
+// This file is the minimum skeleton; real targets expand from here.
+```
+
+## Verify-criterion
+
+- For every `MATRIX` cell, user clicked `Proceed` / `Minimal` / explicit `Edit` / `Skip`.
+- At least one file is written per non-skipped cell.
+- `SCAFFOLDED` is a list of `{cell, files: [paths]}` entries.
+- No file overwrites an existing one without explicit confirmation
+  (a PreWrite check: if path exists, emit a second AskUserQuestion
+  "overwrite / skip / rename" before writing).
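
The PreWrite check in the last criterion amounts to partitioning the planned paths by whether they already exist. This is a minimal Python sketch and not part of the commit; `split_by_existence` is a hypothetical helper, and the demo builds a throwaway directory in place of a real repository.

```python
import tempfile
from pathlib import Path


def split_by_existence(repo_root: Path, planned: list[str]) -> tuple[list[str], list[str]]:
    """Partition planned scaffold paths into (safe_to_write, needs_confirmation)."""
    safe: list[str] = []
    conflicts: list[str] = []
    for rel in planned:
        # An existing path must trigger the "overwrite / skip / rename" question.
        (conflicts if (repo_root / rel).exists() else safe).append(rel)
    return safe, conflicts


# Demo: a temporary repo where fuzz/Cargo.toml already exists.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "fuzz").mkdir()
    (root / "fuzz" / "Cargo.toml").write_text("# existing manifest\n")
    safe, conflicts = split_by_existence(
        root, ["fuzz/Cargo.toml", "fuzz/fuzz_targets/parse.rs"]
    )
    print(safe)       # ['fuzz/fuzz_targets/parse.rs']
    print(conflicts)  # ['fuzz/Cargo.toml']
```

Only the paths in the second list need the extra AskUserQuestion; the first list can be written directly after the per-cell `Proceed` click.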
--- a/skills/test-matrix/phase-4-ci-wire.md
+++ b/skills/test-matrix/phase-4-ci-wire.md
@@ -0,0 +1,87 @@
+# Phase 4 — CI wiring per cell (artifacts + failure policy)
+
+Each scaffolded cell gets exactly one CI job. Different paradigms have
+different failure-budget rules — wire them explicitly, never "all tests
+block merge by default".
+
+## 4a — Per-type failure policy (preview)
+
+Emit a table in chat showing the default policy per `MATRIX` cell:
+
+| Cell | Trigger | Duration | Failure policy |
+|---|---|---|---|
+| fuzz (short) | PR | 60 s per target | **block merge** on any crash |
+| fuzz (nightly) | cron | 1-4 h per target | **artifact + issue**, do not block PRs |
+| property | PR | ~30 s | **block merge** (failures are real bugs) |
+| load (smoke) | PR | 30-60 s | **block merge** if SLO thresholds fail |
+| load (full) | nightly / manual | 10-30 min | **artifact + dashboard**, do not block PRs |
+| e2e (critical) | PR | 2-5 min | **block merge** (retry×2 max) |
+| e2e (full) | nightly | 15-30 min | **artifact + trace**, do not block PRs |
+| mutation | weekly / manual | hours | **dashboard + report**, NEVER block PRs |
+
+Rationale: fuzz, load, and e2e each run in two lanes (a fast lane on PRs, a
+deeper lane nightly). Mutation testing is too slow to block PRs. E2E allows
+retries but caps them at 2 so flakiness is not hidden.
+
+## 4b — Confirm CI jobs (AskUserQuestion multi-select)
+
+```json
+{
+  "questions": [
+    {
+      "question": "Which CI jobs to generate this session?",
+      "header": "CI Jobs",
+      "multiSelect": true,
+      "options": [
+        {"label": "fuzz-smoke (PR)",       "description": "60s per target per PR; blocks merge on crash"},
+        {"label": "fuzz-nightly (cron)",   "description": "1-4h deep fuzz; artifacts uploaded; non-blocking"},
+        {"label": "property (PR)",         "description": "~30s; blocks merge; PROPTEST_CASES=10000 in CI"},
+        {"label": "load-smoke (PR)",       "description": "30-60s; blocks merge if k6 SLO thresholds fail"},
+        {"label": "load-full (nightly)",   "description": "10-30m; uploads HTML report; non-blocking"},
+        {"label": "e2e-critical (PR)",     "description": "5-15 critical journeys; blocks merge; retry×2 max"},
+        {"label": "e2e-full (nightly)",    "description": "full suite; non-blocking; traces on failure"},
+        {"label": "mutation (weekly)",     "description": "full mutation run; emits HTML + score; never blocks PRs"},
+        {"label": "coverage gate",         "description": "add a coverage-diff gate so /test-gen output is measurable"}
+      ]
+    }
+  ]
+}
+```
+
+Options are GENERATED — only show the cell types actually present in
+`MATRIX`. Add `mutation` to the options only if at least one `mutation × _`
+cell was selected in Phase 2.
+
+## 4c — Write the workflow file(s)
+
+Based on `CI` from Phase 1:
+
+- **GitHub Actions** → `.github/workflows/test-matrix.yml` with jobs as
+  selected. One matrix-strategy job per paradigm (language matrix inside).
+- **Forgejo Actions** → `.forgejo/workflows/test-matrix.yml` (same schema
+  as GH Actions, compatible syntax). KeiSeiKit default (RULE 0.1).
+- **Self-hosted / custom** → emit portable YAML + a `Makefile` / `justfile`
+  with the same job commands so humans can wire into any CI.
+- **None — local only** → write only `Makefile` / `justfile` targets
+  (`make fuzz-smoke`, `make load-smoke`, etc.) and a `docs/testing/ci.md`
+  note explaining how to wire them into CI later.
+
+## 4d — Artifact discipline
+
+Every job uploads one artifact directory, never loose files:
+
+- `fuzz` → `fuzz/artifacts/` (crash inputs + minimized reproducers)
+- `load` → `load/reports/` (HTML, JSON summaries, Grafana links)
+- `e2e` → `test-results/` (traces, videos, screenshots — Playwright default)
+- `mutation` → `mutation-report/` (HTML + JSON)
+
+Retention: 30 days default; 90 days for nightly + weekly jobs. Never
+infinite — CI storage costs compound.
+
+## Verify-criterion
+
+- `CI_JOBS` has ≥ 1 entry (else redirect to local-only Makefile path).
+- Workflow file writes to the correct path per `CI` from Phase 1.
+- Every job declares explicit `timeout-minutes` (no unbounded runs).
+- Every job uploads artifacts on failure (not just on success).
+- No job `continue-on-error: true` for PR-blocking lanes.
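
The last three Phase 4 criteria can be checked mechanically once a workflow is parsed into dicts. This is a hedged Python sketch and not part of the commit; `lint_job` is hypothetical, the job dict mirrors only the handful of workflow keys it inspects, and the way it recognises an upload step (a `uses` value containing `upload-artifact` with a bare `failure()` or `always()` condition) is a simplifying assumption.

```python
# Lanes that block merges, taken from the 4a policy table.
PR_BLOCKING = {"fuzz-smoke", "property", "load-smoke", "e2e-critical"}


def lint_job(name: str, job: dict) -> list[str]:
    """Return verify-criterion violations for one parsed workflow job."""
    problems: list[str] = []
    if "timeout-minutes" not in job:
        problems.append(f"{name}: missing timeout-minutes")
    if name in PR_BLOCKING and job.get("continue-on-error") is True:
        problems.append(f"{name}: continue-on-error on a PR-blocking lane")
    # Assumption: an upload step is identified by its `uses` value and must
    # carry `if: failure()` or `if: always()` to run when the job fails.
    uploads = [
        step for step in job.get("steps", [])
        if "upload-artifact" in str(step.get("uses", ""))
    ]
    if not any(str(s.get("if", "")).strip() in {"failure()", "always()"} for s in uploads):
        problems.append(f"{name}: no artifact upload on failure")
    return problems


# Hypothetical job: bounded and uploading on failure, but wrongly non-blocking.
job = {
    "timeout-minutes": 10,
    "continue-on-error": True,
    "steps": [
        {"run": "cargo fuzz run parse -- -max_total_time=60"},
        {"uses": "actions/upload-artifact@v4", "if": "failure()",
         "with": {"path": "fuzz/artifacts/"}},
    ],
}
print(lint_job("fuzz-smoke", job))
# ['fuzz-smoke: continue-on-error on a PR-blocking lane']
```

A check like this could run before the workflow file is written, so a violation is caught at generation time and not on the first failing PR.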
--- a/skills/test-matrix/phase-5-triage.md
+++ b/skills/test-matrix/phase-5-triage.md
@@ -0,0 +1,70 @@
+# Phase 5 — Crash / regression triage workflow
+
+Every matrix paradigm produces artifacts when it fails: fuzz crashes,
+shrunk property counterexamples, load-SLO violations, E2E traces,
+mutation survivors. Without a triage runbook, those artifacts rot.
+This phase writes `docs/testing/triage.md` so the next failure is
+actionable in ≤ 15 min.
+
+## 5a — Confirm runbook generation (AskUserQuestion)
+
+```json
+{
+  "questions": [
+    {
+      "question": "Write the triage runbook to docs/testing/triage.md?",
+      "header": "Triage",
+      "multiSelect": false,
+      "options": [
+        {"label": "Yes — full runbook",   "description": "Per-paradigm crash / regression flow + artifact paths + commit template"},
+        {"label": "Yes — minimal",        "description": "One-page checklist only; skip per-paradigm deep-dives"},
+        {"label": "Skip — team already has one", "description": "Finish without writing; final report notes the external link"}
+      ]
+    }
+  ]
+}
+```
+
+## 5b — Runbook template (full)
+
+For every selected paradigm in `MATRIX`, emit a section:
+
+```
+## <paradigm> failure triage
+
+1. Artifact: <fuzz/artifacts/ | .proptest-regressions | load/reports/ | test-results/ | mutation-report/>
+2. Reproduce locally: <exact command from phase-3 scaffold>
+3. Minimize: <tmin / shrink / trace-viewer / bisect>
+4. Write a failing regression test using the minimized input.
+5. Fix root cause (never the symptom — see RULE: No Patching).
+6. Re-run the matrix cell. Green = commit with `fix:` + reference artifact SHA.
+7. If flaky (not deterministic): quarantine with a ticket, never `retry: 5`.
+```
+
+Per-paradigm specifics are pulled from the cited `_blocks/test-*.md`:
+- fuzz → `cargo fuzz tmin` / atheris replay flow (block §crash-triage)
+- property → commit the shrunk counterexample as a normal unit test
+- load → re-baseline after each fix; one variable at a time
+- e2e → inspect the trace with `npx playwright show-trace <trace.zip>`; never add `waitForTimeout`
+
+## 5c — Commit template
+
+The runbook ends with a ready-to-copy commit template:
+
+```
+fix(<paradigm>): <one-line symptom>
+
+Reproducer: <minimized artifact path + SHA>
+Root cause: <1-2 sentences>
+Regression test: <path to new permanent test>
+
+See docs/testing/triage.md §<paradigm> for the workflow used.
+```
+
+## Verify-criterion
+
+- `TRIAGE_DOC` is set to `docs/testing/triage.md` (or skipped with reason).
+- Every `MATRIX` paradigm has a section in the runbook.
+- Every section lists artifact path + reproduce command + regression-test
+  requirement + root-cause discipline + flake policy.
+- Commit template present at end of doc.
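
The 5c commit template is a fixed shape with five slots, so it can be filled programmatically. This is a minimal Python sketch and not part of the commit; `render_commit` is a hypothetical helper and the values in the example are invented for illustration.

```python
# Template text follows section 5c; only the placeholder syntax is changed.
COMMIT_TEMPLATE = """\
fix({paradigm}): {symptom}

Reproducer: {reproducer}
Root cause: {root_cause}
Regression test: {regression_test}

See docs/testing/triage.md §{paradigm} for the workflow used.
"""


def render_commit(paradigm: str, symptom: str, reproducer: str,
                  root_cause: str, regression_test: str) -> str:
    """Fill the triage commit template with the details of one fix."""
    return COMMIT_TEMPLATE.format(
        paradigm=paradigm,
        symptom=symptom,
        reproducer=reproducer,
        root_cause=root_cause,
        regression_test=regression_test,
    )


# Invented example values.
message = render_commit(
    paradigm="fuzz",
    symptom="parser panics on empty header",
    reproducer="fuzz/artifacts/parse/crash-01 (SHA elided)",
    root_cause="Header length was read before the bounds check.",
    regression_test="tests/regression_empty_header.rs",
)
print(message.splitlines()[0])  # fix(fuzz): parser panics on empty header
```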