- Add insta + tempfile to _assembler/Cargo.toml [dev-dependencies].
- Create tests/common/mod.rs with helpers: seed_tempdir (copies fixtures into an isolated AGENT_ROOT), run_assemble (invokes the built binary via std::process::Command), and assemble_one (end-to-end single-manifest helper).
- Seed tests/fixtures/ with the 4 manifests covered by the golden snapshots (code-implementer, researcher, cost-guardian, patent-compliance) and the 7 blocks they reference (baseline, evidence-grading, memory-protocol, rule-pre-dev-gate, rule-test-first, rule-error-budget, rule-double-audit).

This is a binary-only crate (no lib target), so integration tests invoke the assemble binary as a subprocess instead of calling internal functions. This exercises the full main.rs I/O + validator + assembler pipeline end-to-end, which is exactly what the determinism claim covers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
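The two lower-level helpers described above might look roughly like the following. This is a minimal std-only sketch, not the contents of tests/common/mod.rs: the commit uses the tempfile crate, which is swapped here for a directory under std::env::temp_dir() so the example has no dependencies. The signatures, the copy_dir helper, the explicit bin parameter, and the assumption that AGENT_ROOT is read as an environment variable are all hypothetical.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::process::{Command, Output};

/// Recursively copy `src` into `dst`, creating directories as needed.
/// Hypothetical helper, not named in the commit.
fn copy_dir(src: &Path, dst: &Path) -> io::Result<()> {
    fs::create_dir_all(dst)?;
    for entry in fs::read_dir(src)? {
        let entry = entry?;
        let target = dst.join(entry.file_name());
        if entry.file_type()?.is_dir() {
            copy_dir(&entry.path(), &target)?;
        } else {
            fs::copy(entry.path(), &target)?;
        }
    }
    Ok(())
}

/// Copy the fixtures into a fresh directory that serves as an isolated root.
/// Std-only stand-in for the tempfile crate: the caller is responsible for
/// removing the returned directory when the test finishes.
pub fn seed_tempdir(fixtures: &Path, label: &str) -> io::Result<PathBuf> {
    let name = format!("assembler-test-{}-{}", label, std::process::id());
    let root = std::env::temp_dir().join(name);
    if root.exists() {
        fs::remove_dir_all(&root)?;
    }
    copy_dir(fixtures, &root)?;
    Ok(root)
}

/// Run the binary at `bin` as a subprocess and capture its output.
/// Assumption: the binary reads AGENT_ROOT from its environment.
pub fn run_assemble(bin: &Path, agent_root: &Path, args: &[&str]) -> io::Result<Output> {
    Command::new(bin)
        .env("AGENT_ROOT", agent_root)
        .args(args)
        .output()
}
```

Inside a Cargo integration test, the path of the built binary is available through the CARGO_BIN_EXE_<name> environment variable that Cargo sets at compile time, which is one way to supply bin without hard-coding a target directory.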
693 B
ERROR BUDGET — 3-Level Escalation
Counter: each FAILED attempt on the SAME problem = +1. Success = reset.
- Level 1 (attempt 2 failed): STOP. Rollback (git stash). Re-read plan. Formulate ALTERNATIVE. Explain to user before continuing.
- Level 2 (attempt 3 failed): STOP. Approach exhausted. Run focused research. Audit affected module. Check wrong-paths.md. New plan with evidence grades → user approval → THEN code.
- Level 3 (still stuck): ESCALATE. Tell user "more complex than initially thought". Suggest workaround / simplify scope / defer / redesign.
Prohibited: third attempt with same approach; skipping Level 1; silent research without notifying user.
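The counter rule above can be read as a small state machine. The sketch below is one possible interpretation, not part of the rule block: the type and method names are hypothetical, and mapping "still stuck" to a fourth or later failure is an assumption, since the rule does not give a number for Level 3.

```rust
/// Escalation level derived from the failure counter.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Escalation {
    /// First failure or no failure: keep working.
    None,
    /// Attempt 2 failed: stop, roll back, propose an alternative.
    Level1,
    /// Attempt 3 failed: stop, research, re-plan, wait for approval.
    Level2,
    /// Still stuck (assumed here: 4 or more failures): escalate to the user.
    Level3,
}

/// Counts failed attempts on the same problem.
#[derive(Debug, Default)]
pub struct ErrorBudget {
    failures: u32,
}

impl ErrorBudget {
    pub fn new() -> Self {
        Self::default()
    }

    /// Each failed attempt on the same problem adds one to the counter.
    pub fn record_failure(&mut self) -> Escalation {
        self.failures += 1;
        self.level()
    }

    /// A success resets the counter.
    pub fn record_success(&mut self) {
        self.failures = 0;
    }

    /// Map the current counter value to an escalation level.
    pub fn level(&self) -> Escalation {
        match self.failures {
            0 | 1 => Escalation::None,
            2 => Escalation::Level1,
            3 => Escalation::Level2,
            _ => Escalation::Level3,
        }
    }
}
```

Keeping the mapping in a single match makes the thresholds easy to audit against the rule text. The prohibited actions (repeating the same approach, skipping Level 1, researching without telling the user) are behavioral and are not modeled here.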