Phase 1 of substrate-unified-registry: move all references to user
home memory/rules out of plain strings and into content-addressable
path atoms. Public artefacts now contain opaque `{path::NAME}/file.md`
references; the actual home prefix lives only in the path-atom file's
frontmatter, registered in the local kei-registry.
NEW path atoms (`_blocks/path-*.md`):
- `path-user-memory.md` → template `~/.claude/memory`
- `path-user-rules.md` → template `~/.claude/rules`
Both files use frontmatter `type: atom, kind: path, template: ..., expand_at: render`.
BlockMdScanner auto-registers them; DNA index shows them under their
unprefixed names (`user-memory`, `user-rules`) for human lookup, while
the body sha8 makes them content-addressable.
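The `expand_at: render` field implies that a consumer substitutes the template for the opaque reference when the artefact is rendered. A minimal sketch of that step follows, assuming the templates have already been read from the path-atom frontmatter into a map. The function name `expand_path_ref`, the map shape, and the treatment of a leading `~` are all hypothetical and not taken from this commit.

```rust
use std::collections::HashMap;

/// Hypothetical helper: expand `{path::NAME}/rest` using a name -> template map.
/// Returns `None` when the input is not a path-atom reference or NAME is unknown.
fn expand_path_ref(
    reference: &str,
    templates: &HashMap<&str, &str>,
    home: &str,
) -> Option<String> {
    // Split `{path::NAME}/rest` into NAME and the remaining `/rest`.
    let rest = reference.strip_prefix("{path::")?;
    let (name, suffix) = rest.split_once('}')?;
    let template = templates.get(name)?;
    // Assumption: templates use a leading `~` to stand for the home directory.
    let base = match template.strip_prefix('~') {
        Some(tail) => format!("{home}{tail}"),
        None => template.to_string(),
    };
    Some(format!("{base}{suffix}"))
}
```

With a map holding `user-rules` → `~/.claude/rules` and a home of `/home/alice`, the reference `{path::user-rules}/ml-protocol.md` would expand to `/home/alice/.claude/rules/ml-protocol.md`; an unknown atom name yields `None`, so the caller can decide how to report it.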
Resolver (`_assembler/src/registry_client.rs`):
- `is_path_atom(conn, name)` — checks DB by name + filename convention
(`_blocks/path-<name>.md`) + frontmatter `kind: path`. Defensive:
filename + frontmatter must BOTH agree.
- `frontmatter_has_kind_path(body)` — minimal YAML parser. Tolerates
CRLF, quoted values, rejects substring matches (`pathological` ≠ `path`).
- 5 unit tests cover 2 positive + 3 negative cases.
Resolver wire-up (`_assembler/src/assembler.rs:147 write_references`):
- For each `references.extra` entry starting with `path:NAME/...`:
  - Look up `NAME` via `is_path_atom`.
  - On success: emit `{path::NAME}/<suffix>`, which is opaque and kit-resolvable.
  - On miss: warn on stderr and pass the entry through. Never fatal.
- Non-`path:` refs pass through unchanged. Backward compatible.
- 2 unit tests cover passthrough paths.
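The rewrite rule described above can be sketched as a pure function. This is an illustration only, not the code in `write_references`: the name `rewrite_reference` is hypothetical, and the registry lookup is injected as a closure so the sketch needs no database.

```rust
/// Hypothetical stand-in for the per-entry rewrite step.
/// `is_path_atom` is passed in as a closure instead of querying the registry.
fn rewrite_reference<F>(entry: &str, is_path_atom: F) -> String
where
    F: Fn(&str) -> bool,
{
    // Only entries of the form `path:NAME/suffix` are candidates.
    let Some(rest) = entry.strip_prefix("path:") else {
        return entry.to_string();
    };
    // Assumption: an entry without a `/` after NAME is passed through as-is.
    let Some((name, suffix)) = rest.split_once('/') else {
        return entry.to_string();
    };
    if is_path_atom(name) {
        // Emit the opaque form `{path::NAME}/suffix`.
        format!("{{path::{name}}}/{suffix}")
    } else {
        // Miss: warn and pass the original entry through, never fail.
        eprintln!("warn: '{name}' is not a registered path atom; passing '{entry}' through");
        entry.to_string()
    }
}
```

Called with `path:user-rules/ml-protocol.md` and a lookup that accepts `user-rules`, this returns `{path::user-rules}/ml-protocol.md`. Keeping the lookup behind a closure is a choice made for the sketch so that the passthrough cases can be exercised without a registry file.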
Manifest migration (38 manifests touched):
- `~/.claude/rules/<file>` → `path:user-rules/<file>`
- `~/.claude/memory/<file>` → `path:user-memory/<file>`
- 96 references migrated; 1 prose-style reference in security-auditor
left as plain text (lives inside a domain_in description, not in
references.extra — out of scope for this resolver).
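The two manifest rewrites listed above amount to a prefix substitution. A minimal sketch follows, assuming each reference is handled as a single string; `LEGACY_PREFIXES` and `migrate_reference` are hypothetical names, and the actual migration in this commit may have been done differently (for example by a one-off script).

```rust
/// Hypothetical mapping from legacy home-relative prefixes to path-atom names.
const LEGACY_PREFIXES: [(&str, &str); 2] = [
    ("~/.claude/rules/", "user-rules"),
    ("~/.claude/memory/", "user-memory"),
];

/// Rewrite a legacy reference to the `path:NAME/<file>` form.
/// References that match no legacy prefix are returned unchanged.
fn migrate_reference(entry: &str) -> String {
    for (prefix, atom) in LEGACY_PREFIXES {
        if let Some(file) = entry.strip_prefix(prefix) {
            return format!("path:{atom}/{file}");
        }
    }
    entry.to_string()
}
```

Because the function only matches at the start of the string, a path mentioned inside longer prose (such as the `domain_in` description noted above) would be left alone.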
Regenerated 38 `_generated/*.md` + 1 new `frontend-validator.md`.
Regenerated `docs/DNA-INDEX.md` (now includes 2 path-atoms by name).
Verification (cited):
- `git ls-files | grep denisparfionovich` → 0 hits outside allowlist
(NOTICE/README byline + `.github/workflows/leak-check.yml` detection
rule).
- `_generated/` contains 99 occurrences of `{path::user-...}/`.
- assembler tests: 29 passed (5 new). kei-registry tests: 10 passed
(8 short_path from earlier commit + 2 unrelated).
- assembler resolver verified end-to-end: ml-implementer.md lines
479-485 show `{path::user-rules}/ml-protocol.md` etc.
What this does NOT do (deferred):
- No registry-DB schema change. Path atoms ride the existing Atom block
type by convention, not via a new `BlockType::PathAtom` variant.
- No git-branch tracking (Phase 2 of plan).
- No `kei-registry status` cross-cutting CLI (Phase 3 of plan).
- No path-atom orphan detection CLI (Phase 4).
The `path:user-memory` and `path:user-rules` atoms cover 100% of the
username-leak surface in the current manifest set; future categories
(kit-root, registry-db, sync-repo, secrets-env, project-root) can
land additively without architectural changes.
=== STATUS-TRUTH MARKER ===
shipped: functional
stubs: 0
cargo-check: PASS
behaviour-verified: yes
follow-up-required:
- Phase 2 (git-branch tracker hook)
- Phase 3 (kei-registry status subcommand)
- Phase 4 (orphan detection CLI)
- Sync user-side install: ~/.claude/agents/_manifests/ still has
pre-migration absolute paths; will pick up new format on next
`install.sh --add` (out of scope for this commit).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
193 lines
6.9 KiB
Rust
//! Thin read-only client over `~/.claude/registry.sqlite`.
//!
//! Fetches rule-fragment content by logical name (`rule::section`).
//! The registry stores the real filesystem path; this module reads that path.
//!
//! Constructor Pattern: one responsibility — lookup + read fragment body.
//! No writes. No schema migration. Opens DB read-only.

use rusqlite::{Connection, OpenFlags};
use std::path::{Path, PathBuf};

/// Open the registry at `db_path` in read-only mode.
pub fn open_read_only(db_path: &Path) -> Result<Connection, String> {
    Connection::open_with_flags(db_path, OpenFlags::SQLITE_OPEN_READ_ONLY)
        .map_err(|e| format!("open registry {}: {e}", db_path.display()))
}

/// Default path: `$KEI_REGISTRY_DB` (if set) or `~/.claude/registry.sqlite`.
pub fn default_db_path() -> PathBuf {
    if let Some(v) = std::env::var_os("KEI_REGISTRY_DB") {
        return PathBuf::from(v);
    }
    let home = std::env::var_os("HOME").unwrap_or_default();
    PathBuf::from(home).join(".claude/registry.sqlite")
}

/// Look up a rule fragment by `name` (e.g. `"karpathy-behavioral::1-think-before-coding"`).
///
/// Returns:
/// - `Ok(Some(body))` — fragment found and file readable.
/// - `Ok(None)` — name not in registry, or registry path does not exist on disk.
///   Caller should warn-and-skip.
/// - `Err(msg)` — DB query failure (not a missing-path issue). Propagate.
pub fn find_rule(conn: &Connection, name: &str) -> Result<Option<String>, String> {
    let path = match query_path(conn, name)? {
        Some(p) => p,
        None => return Ok(None),
    };
    read_fragment_body(name, &path)
}

/// Query the `path` column for the active row with `name` and `block_type='rule'`.
fn query_path(conn: &Connection, name: &str) -> Result<Option<String>, String> {
    let mut stmt = conn
        .prepare(
            "SELECT path FROM blocks \
             WHERE name = ?1 AND block_type = 'rule' AND superseded_by IS NULL \
             LIMIT 1",
        )
        .map_err(|e| format!("prepare query for {name}: {e}"))?;
    let row: Option<String> = stmt
        .query_row(rusqlite::params![name], |r| r.get(0))
        .optional()
        .map_err(|e| format!("query registry for {name}: {e}"))?;
    Ok(row)
}

/// Read the fragment body from `path`. Returns `Ok(None)` when the file is absent.
fn read_fragment_body(name: &str, path: &str) -> Result<Option<String>, String> {
    match std::fs::read_to_string(path) {
        Ok(body) => Ok(Some(body)),
        Err(e) if e.kind() == std::io::ErrorKind::NotFound => {
            eprintln!(
                "warn [assembler]: registry fragment for '{name}' has path '{path}' but file is missing — skipping. \
                 Run `kei-decompose decompose-rules --rebuild-fragments` to restore."
            );
            Ok(None)
        }
        Err(e) => Err(format!("read fragment for {name} at {path}: {e}")),
    }
}

trait OptionalExt<T>: Sized {
    fn optional(self) -> rusqlite::Result<Option<T>>;
}

impl<T> OptionalExt<T> for rusqlite::Result<T> {
    fn optional(self) -> rusqlite::Result<Option<T>> {
        match self {
            Ok(v) => Ok(Some(v)),
            Err(rusqlite::Error::QueryReturnedNoRows) => Ok(None),
            Err(e) => Err(e),
        }
    }
}

/// Check if `name` is a registered path-atom.
///
/// Convention: a path-atom is an atom whose source file is
/// `_blocks/path-<name>.md` and whose YAML frontmatter declares
/// `kind: path`. The DB stores only the file path (not body), so this
/// function uses the filename convention as a fast first check, then
/// reads the file and parses the frontmatter to confirm `kind: path`.
///
/// Returns:
/// - `Ok(true)` — atom registered under `name`, file exists, frontmatter
///   declares `kind: path`. Caller may emit an opaque resolved reference.
/// - `Ok(false)` — atom not found, or found but not a path-atom. Caller
///   should pass the original reference through unchanged (with optional
///   warn-and-skip in caller).
/// - `Err(msg)` — DB query failure. Propagate.
pub fn is_path_atom(conn: &Connection, name: &str) -> Result<bool, String> {
    let mut stmt = conn
        .prepare(
            "SELECT path FROM blocks \
             WHERE name = ?1 AND block_type = 'atom' AND superseded_by IS NULL \
             LIMIT 1",
        )
        .map_err(|e| format!("prepare path-atom query for {name}: {e}"))?;
    let path: Option<String> = stmt
        .query_row(rusqlite::params![name], |r| r.get(0))
        .optional()
        .map_err(|e| format!("query path-atom {name}: {e}"))?;
    let Some(p) = path else { return Ok(false) };
    // Filename convention check: `_blocks/path-<name>.md`. A cheap suffix
    // comparison avoids the file read on the common non-path-atom case.
    let expected_suffix = format!("/_blocks/path-{name}.md");
    if !p.ends_with(&expected_suffix) {
        return Ok(false);
    }
    // Read frontmatter to confirm `kind: path`. Defensive — convention is
    // not authoritative on its own; explicit declaration is.
    let body = match std::fs::read_to_string(&p) {
        Ok(b) => b,
        Err(_) => return Ok(false),
    };
    Ok(frontmatter_has_kind_path(&body))
}

/// Return true if `body` starts with a YAML frontmatter block (`---\n...---\n`)
/// containing a line whose key is `kind` and value is `path`. Tolerates
/// `---\r\n`, surrounding whitespace, and YAML quoting.
fn frontmatter_has_kind_path(body: &str) -> bool {
    let stripped = match body
        .strip_prefix("---\n")
        .or_else(|| body.strip_prefix("---\r\n"))
    {
        Some(s) => s,
        None => return false,
    };
    let end = match stripped
        .find("\n---\n")
        .or_else(|| stripped.find("\r\n---\r\n"))
    {
        Some(i) => i,
        None => return false,
    };
    let frontmatter = &stripped[..end];
    for line in frontmatter.lines() {
        let line = line.trim();
        if let Some(rest) = line.strip_prefix("kind:") {
            let val = rest.trim().trim_matches(&['\'', '"'][..]);
            return val == "path";
        }
    }
    false
}

#[cfg(test)]
mod tests {
    use super::frontmatter_has_kind_path;

    #[test]
    fn detects_kind_path_in_frontmatter() {
        let body = "---\ntype: atom\nkind: path\nname: foo\n---\n\n# body\n";
        assert!(frontmatter_has_kind_path(body));
    }

    #[test]
    fn rejects_kind_other() {
        let body = "---\ntype: atom\nkind: other\n---\n";
        assert!(!frontmatter_has_kind_path(body));
    }

    #[test]
    fn rejects_no_frontmatter() {
        let body = "# just markdown\n";
        assert!(!frontmatter_has_kind_path(body));
    }

    #[test]
    fn tolerates_quoted_value() {
        let body = "---\nkind: \"path\"\n---\n";
        assert!(frontmatter_has_kind_path(body));
    }

    #[test]
    fn rejects_kind_path_substring() {
        // `kind: pathological` must NOT match `kind: path`.
        let body = "---\nkind: pathological\n---\n";
        assert!(!frontmatter_has_kind_path(body));
    }
}