fix(substrate): amendment A-1 (input/output required all kinds) + integration test + jsonschema 0.18 relative-$id bug

Three post-E1/E2/E3-merge items:

1. Schema amendment A-1 (architect P0-a, non-breaking clarification):
   input.schema and output.schema are REQUIRED for all atom kinds. The
   shared kei-atom-discovery parses them as Option<PathBuf> only to allow
   tolerant skip-on-missing (stderr warn), not to permit absent schemas.
   Resolves Stream C / Stream D enforcement asymmetry documented in
   critic finding #6.

2. Cross-stream integration test (architect P0-b): tests/substrate_integration.sh
   builds release binaries, scaffolds a test atom corpus, runs
   schema-lint + list-atoms + atoms-discover + invoke; asserts all four
   streams agree on the same atom corpus and exit codes honour the
   locked §Runtime contract. Previously missing — only manual smoke
   checks existed.

3. Fix regression introduced by E1's jsonschema 0.18 upgrade:
   "relative URL without a base" on compile when schema declared a
   relative $id like "kei-task/atoms/schemas/create-input.json".
   validate.rs now synthesises an absolute file:// $id from the
   canonicalised schema path before compile. Internal $refs still
   resolve relative to the schema file; LocalFileResolver still confines
   to the schema's parent dir. Integration test catches this.
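
A minimal sketch of the confinement idea from item 3, assuming the resolver
canonicalises both the root and the `$ref` target and then compares path
prefixes. `resolve_confined` is a hypothetical helper written for
illustration, not the actual LocalFileResolver code, and it uses only the
standard library:

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper (not the real LocalFileResolver): resolve `reference`
/// relative to `root` and accept it only if the canonical target is still
/// inside the canonical root.
fn resolve_confined(root: &Path, reference: &str) -> io::Result<PathBuf> {
    // Canonicalising resolves symlinks and `..`, so the prefix check below
    // compares real locations rather than spellings.
    let root = root.canonicalize()?;
    let target = root.join(reference).canonicalize()?;
    if target.starts_with(&root) {
        Ok(target)
    } else {
        Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            format!("{} escapes {}", target.display(), root.display()),
        ))
    }
}
```

Under these assumptions a sibling schema such as "create-output.json"
resolves, while "../outside.json" is rejected even though the file exists.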

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Parfii-bot 2026-04-23 00:56:27 +08:00
parent ca0635e0fa
commit f7e4725573
3 changed files with 169 additions and 1 deletions


@@ -39,8 +39,16 @@ pub fn validate_output(schema_path: &Path, output: &Value) -> Result<(), Validat
fn validate_value(schema_path: &Path, value: &Value) -> Result<(), ValidationError> {
let schema_text = std::fs::read_to_string(schema_path)
.map_err(|e| ValidationError(format!("read {}: {e}", schema_path.display())))?;
-let schema_json: Value = serde_json::from_str(&schema_text)
+let mut schema_json: Value = serde_json::from_str(&schema_text)
.map_err(|e| ValidationError(format!("parse {}: {e}", schema_path.display())))?;
// jsonschema 0.18 requires an absolute base URI for the schema. Our atom
// schemas typically declare a relative `$id` like
// "kei-task/atoms/schemas/create-input.json" which fails compile with
// "relative URL without a base". Inject a synthetic `file://` $id keyed
// to the actual schema path so any internal `$ref` still resolves
// relative to the file (and our LocalFileResolver confines to the
// schema's parent dir for safety).
inject_absolute_id(&mut schema_json, schema_path);
let root = schema_path.parent().unwrap_or(schema_path).to_path_buf();
let compiled = JSONSchema::options()
.with_draft(jsonschema::Draft::Draft7)
@@ -54,6 +62,25 @@ fn validate_value(schema_path: &Path, value: &Value) -> Result<(), ValidationErr
Ok(())
}
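/// Replaces a missing or relative top-level `$id` with an absolute `file://`
/// URL derived from the canonicalised `schema_path`. An already-absolute
/// `$id` is left untouched; if the schema is not an object, or if
/// canonicalisation or URL conversion fails, the schema is left unchanged.
fn inject_absolute_id(schema: &mut Value, schema_path: &Path) {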
let obj = match schema.as_object_mut() {
Some(o) => o,
None => return,
};
let needs_replace = match obj.get("$id").and_then(|v| v.as_str()) {
None => true, // missing
Some(s) => Url::parse(s).is_err(), // non-absolute
};
if !needs_replace {
return;
}
if let Ok(canon) = schema_path.canonicalize() {
if let Ok(url) = Url::from_file_path(&canon) {
obj.insert("$id".to_string(), Value::String(url.to_string()));
}
}
}
/// `$ref` resolver that rejects every scheme except `file://`, AND rejects
/// any path that is not inside `root` (canonicalised).
#[derive(Debug)]


@@ -391,3 +391,11 @@ Non-breaking additions (new optional fields, new atom kinds, new side-effect dom
| 6 | Error model per-atom vs shared registry | **Per-atom** | Simpler to start. Registry can be added later non-breakingly. |
**Locked values:** all of the above. Breaking changes to any of these during 6-week parallel window require explicit user revocation + all-streams sync + ledger row.
## Amendments — non-breaking clarifications
| # | Date | Clarification | Reason |
|---|---|---|---|
| A-1 | 2026-04-23 | **`input.schema` and `output.schema` are REQUIRED for all atom kinds** (`command` / `query` / `stream` / `transform`). An atom with no inputs should declare `input.schema` pointing to a JSON Schema with `{"type": "object", "properties": {}, "additionalProperties": false}` — i.e., "empty object". Similarly for no-output. The runtime + graph lint BOTH enforce presence of the schema ref; shared `kei-atom-discovery` parses them as `Option<PathBuf>` only to allow tolerant skip-on-missing (with stderr warning) rather than aborting the whole scan on one bad atom. | Architect P0-a (post-audit 2026-04-23) — Stream C parsed input/output Optional, Stream D required. Asymmetric enforcement → "sage sees atom, runtime skips" drift. Both streams now agree: Optional at parse layer, required at lint layer. |
These amendments document interpretations consistent with the locked schema — no frontmatter-shape change, no wire-format change, no stream refactor required.
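
A minimal sketch of the "Optional at parse layer, required at lint layer" split described in A-1. The `ParsedAtom` struct, its field names, and the `discover` / `lint` functions are hypothetical simplifications for illustration; they are not the actual `kei-atom-discovery` or lint APIs.

```rust
use std::path::PathBuf;

/// Hypothetical, simplified view of a parsed atom (illustrative field names).
#[derive(Debug, Clone)]
struct ParsedAtom {
    id: String,
    input_schema: Option<PathBuf>,
    output_schema: Option<PathBuf>,
}

/// Discovery layer: tolerate a missing schema ref by warning on stderr and
/// skipping that atom, so one bad atom does not abort the whole scan.
fn discover(candidates: Vec<ParsedAtom>) -> Vec<ParsedAtom> {
    candidates
        .into_iter()
        .filter(|atom| {
            let complete = atom.input_schema.is_some() && atom.output_schema.is_some();
            if !complete {
                eprintln!(
                    "warn: skipping {}: missing input.schema or output.schema",
                    atom.id
                );
            }
            complete
        })
        .collect()
}

/// Lint layer: the same absence is a hard error, because A-1 makes both
/// schema refs required for every atom kind.
fn lint(atom: &ParsedAtom) -> Result<(), String> {
    if atom.input_schema.is_none() {
        return Err(format!("{}: input.schema is required", atom.id));
    }
    if atom.output_schema.is_none() {
        return Err(format!("{}: output.schema is required", atom.id));
    }
    Ok(())
}
```

In this sketch the `Option` exists only so the scan can continue past a malformed atom; any consumer that enforces the contract treats `None` as a failure.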

tests/substrate_integration.sh Executable file

@@ -0,0 +1,133 @@
#!/usr/bin/env bash
# substrate_integration.sh — cross-stream integration smoke test
#
# Architect P0-b (audit wave 2026-04-23): each stream (kei-forge / kei-task
# atoms / kei-sage / kei-runtime) has its own smoke tests, but no single
# test exercised the cross-stream composition. This script is that test.
#
# The check: build release binaries, scaffold a hand-crafted test atom in a
# temp corpus, then verify that kei-runtime + kei-sage BOTH discover it
# identically, that kei-runtime schema-lint passes on it, and that invoke
# returns the expected exit codes.
#
# Exit 0 = substrate v1 contract holds end-to-end
# Exit 1 = any step failed — see stderr for the offending stage
set -euo pipefail
ROOT="$(cd "$(dirname "$0")/.." && pwd)"
cd "$ROOT"
TMPROOT="$(mktemp -d)"
trap 'rm -rf "$TMPROOT"' EXIT
fail() { echo "SUBSTRATE-INTEGRATION FAIL: $*" >&2; exit 1; }
echo "==> Building release binaries (kei-runtime, kei-sage)…"
cd _primitives/_rust
cargo build --release -p kei-runtime -p kei-sage >/dev/null 2>&1 \
|| fail "cargo build failed"
RT="$(pwd)/target/release/kei-runtime"
SAGE="$(pwd)/target/release/kei-sage"
cd "$ROOT"
echo "==> Scaffolding a hand-crafted atom (kei-task::create) for an isolated test corpus…"
CORPUS="$TMPROOT/corpus/kei-task"
mkdir -p "$CORPUS"/{atoms/schemas,src/atoms,tests}
# Minimal hand-crafted atom mirroring Stream B's create atom shape —
# covers all REQUIRED frontmatter fields so schema-lint passes.
cat > "$CORPUS/atoms/create.md" <<'EOF'
---
atom: kei-task::create
kind: command
version: "0.22.3"
input:
schema: schemas/create-input.json
required: [title]
example: { title: "x" }
output:
schema: schemas/create-output.json
example: { id: 1 }
errors:
- code: DuplicateTitle
http_analog: 409
side_effects:
- { op: write, domain: kei-task-db }
idempotent: false
timeout_ms: 5000
stability: stable
keywords: [integration-test]
related: []
---
# kei-task::create
Integration-test atom. See substrate_integration.sh.
EOF
cat > "$CORPUS/atoms/schemas/create-input.json" <<'EOF'
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "kei-task/atoms/schemas/create-input.json",
"title": "kei-task::create input",
"type": "object",
"required": ["title"],
"properties": {
"title": { "type": "string", "minLength": 1 }
},
"additionalProperties": false,
"examples": [{"title": "x"}]
}
EOF
cat > "$CORPUS/atoms/schemas/create-output.json" <<'EOF'
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "kei-task/atoms/schemas/create-output.json",
"title": "kei-task::create output",
"type": "object",
"properties": { "id": { "type": "integer" } },
"additionalProperties": false,
"examples": [{"id": 1}]
}
EOF
echo "==> kei-runtime schema-lint…"
"$RT" schema-lint --root "$TMPROOT/corpus" \
| grep -q "^PASS" \
|| fail "schema-lint did not report PASS"
echo "==> kei-runtime list-atoms…"
LIST="$("$RT" list-atoms --root "$TMPROOT/corpus")"
echo "$LIST" | grep -q "kei-task::create" \
|| fail "kei-runtime list-atoms did not see kei-task::create"
echo "==> kei-sage atoms-discover…"
DISCOVER="$("$SAGE" atoms-discover --root "$TMPROOT/corpus")"
echo "$DISCOVER" | grep -q "kei-task::create" \
|| fail "kei-sage atoms-discover did not see kei-task::create"
echo "==> Cross-stream ID agreement…"
RT_IDS="$(echo "$LIST" | awk '{print $1}' | sort)"
SAGE_IDS="$(echo "$DISCOVER" | awk 'NR>1 && $1 != "" {print $1}' | sort)"
[ "$RT_IDS" = "$SAGE_IDS" ] \
|| fail "runtime and sage disagree on atom IDs:"$'\n'"  runtime: $RT_IDS"$'\n'"  sage: $SAGE_IDS"
echo "==> kei-runtime invoke (expects NotImplemented → exit 64)…"
set +e
"$RT" invoke --root "$TMPROOT/corpus" kei-task::create --input '{"title":"x"}' >/dev/null 2>&1
RC=$?
set -e
[ "$RC" -eq 64 ] \
|| fail "invoke should exit 64 (NotImplemented), got $RC"
echo "==> kei-runtime invoke with bad input (expects InputInvalid → exit 2)…"
set +e
"$RT" invoke --root "$TMPROOT/corpus" kei-task::create --input '{}' >/dev/null 2>&1
RC=$?
set -e
[ "$RC" -eq 2 ] \
|| fail "invoke with missing required field should exit 2, got $RC"
echo ""
echo "✓ SUBSTRATE-INTEGRATION PASS — all 4 streams agree on schema, runtime + sage see same atoms, exit codes per locked §Runtime contract"