KeiSeiKit-1.0/_primitives/_rust/kei-registry/src/secrets.rs
Parfii-bot af46684330 feat(secrets+catalog): orphan-detector for env vars + image/video/voice models
Two parallel agents (both Sonnet 4.6 via the just-activated tier system)
extended the substrate-unified-registry. First end-to-end proof that the
Phase 4 router refactor saves money: no Opus spawns this round.

PART 1 — `kei-registry secrets` subcommand (Agent A — code-implementer)

Reads env-var NAMES from `~/.claude/secrets/.env` (RULE 0.8 SSoT) and
per-project `secrets/*.env`, greps the kit tree for usages, reports
orphans (defined but unreferenced). Live run on this kit found 26 keys,
11 ORPHAN — actionable cleanup candidates incl. GitHub OAuth client
creds, GoDaddy keys, KeiGit admin creds, KEI_MEMORY_TOKEN.

Files:
- `_primitives/_rust/kei-registry/src/secrets.rs` (152 LOC) — pure
  read-side cube. SecretsReport + KeyRow types, env-file parser
  (KEY=value lines, validates `^[A-Z][A-Z0-9_]*$`), walkdir-based
  scanner with skips (target/ node_modules/ .git/ _generated/),
  word-boundary regex per key. ASCII + JSON render.
- `_primitives/_rust/kei-registry/src/secrets_tests.rs` (125 LOC) —
  5 unit tests covering env parse, scan correctness, word-boundary
  regression (`MY_KEY` ≠ `MY_KEY_EXTRA`), JSON roundtrip, ORPHAN marker.
- `_primitives/_rust/kei-registry/src/secrets_handler.rs` (58 LOC) —
  CLI dispatch handler.
- `cli.rs`, `handlers.rs`, `lib.rs` extended with Secrets variant.
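
The word-boundary rule behind the `MY_KEY` ≠ `MY_KEY_EXTRA` regression test can be illustrated without the `regex` crate. This is a minimal std-only sketch, assuming ASCII key names. `is_word_byte` and `contains_whole_word` are hypothetical helpers written for illustration; `secrets.rs` itself uses a `\bKEY\b` regex, whose Unicode-aware boundaries may differ from this ASCII approximation on non-ASCII neighbours.

```rust
/// True when `b` is an ASCII "word" byte: letter, digit, or underscore.
fn is_word_byte(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_'
}

/// Hypothetical stand-in for the `\bKEY\b` check: true when `key` occurs in
/// `text` with no ASCII word byte directly before or after the match.
fn contains_whole_word(text: &str, key: &str) -> bool {
    // Length of the key's first char; also rejects an empty key.
    let Some(first_len) = key.chars().next().map(char::len_utf8) else {
        return false;
    };
    let bytes = text.as_bytes();
    let mut start = 0;
    while let Some(pos) = text[start..].find(key) {
        let begin = start + pos;
        let end = begin + key.len();
        let left_ok = begin == 0 || !is_word_byte(bytes[begin - 1]);
        let right_ok = end == bytes.len() || !is_word_byte(bytes[end]);
        if left_ok && right_ok {
            return true;
        }
        // Step past the first char of this candidate and keep searching.
        start = begin + first_len;
    }
    false
}
```

Because `_` counts as a word byte, a hit on `MY_KEY` inside `MY_KEY_EXTRA` fails the right-hand check, so the shorter key is not credited with a usage it does not have.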

Resolves the asymmetry called out in the design discussion: paths got
atomization (commit 3422bdc), keys get a query-layer instead. Reason:
env-var NAMES are already public and stable; opaque atom-DNA over them
adds zero security and full overhead. Orphan detection is the unique
value, and a 30-LOC subcommand delivers it without a per-key atom file.

PART 2 — kei-model catalog extension (Agent B — fal-ai-runner)

Adds 10 generation-model entries with VERIFIED pricing per RULE 0.4:
- google: gemini-3-1-flash-image, gemini-3-pro-image
- fal.ai: flux-2-pro, flux-pro-1-1, kling-o3, veo-3, ideogram-v3, recraft-v3
- elevenlabs: elevenlabs-v3, elevenlabs-multilingual-v2

Pricing sourced from each provider's public pricing page (URLs cited
per row in `notes` + `source_url` fields); 8/10 verified, 2 marked
needs-verification (gemini-3-pro-image price not found on public page).

Schema additions to `_primitives/_rust/kei-model/src/model.rs` to
support the new entries without `provider = "local"` placeholder:
- Provider enum + 3 variants: Google, Fal, Elevenlabs (with as_str
  + parse impls).
- Capability enum + 9 variants: image-gen, text-to-image, image-edit,
  video-gen, text-to-video, image-to-video, voice-gen, text-to-speech,
  voice-clone (with serde rename + as_str + parse).
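
For orientation, the `as_str` + `parse` pairing on the Capability enum might look roughly like the sketch below. It is an assumption-laden illustration, not the code in `model.rs`: the variant identifiers, the `ALL` constant, and the `Option` return type of `parse` are hypothetical, and the serde rename attributes are left out so the block compiles with std only. Only the nine kebab-case names come from the list above.

```rust
/// Hypothetical mirror of the nine new capability variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Capability {
    ImageGen,
    TextToImage,
    ImageEdit,
    VideoGen,
    TextToVideo,
    ImageToVideo,
    VoiceGen,
    TextToSpeech,
    VoiceClone,
}

impl Capability {
    /// Every variant once, so `parse` can be derived from `as_str`.
    const ALL: [Capability; 9] = [
        Capability::ImageGen,
        Capability::TextToImage,
        Capability::ImageEdit,
        Capability::VideoGen,
        Capability::TextToVideo,
        Capability::ImageToVideo,
        Capability::VoiceGen,
        Capability::TextToSpeech,
        Capability::VoiceClone,
    ];

    /// Kebab-case name as listed in the commit message.
    fn as_str(self) -> &'static str {
        match self {
            Capability::ImageGen => "image-gen",
            Capability::TextToImage => "text-to-image",
            Capability::ImageEdit => "image-edit",
            Capability::VideoGen => "video-gen",
            Capability::TextToVideo => "text-to-video",
            Capability::ImageToVideo => "image-to-video",
            Capability::VoiceGen => "voice-gen",
            Capability::TextToSpeech => "text-to-speech",
            Capability::VoiceClone => "voice-clone",
        }
    }

    /// Exact, case-sensitive inverse of `as_str`; unknown names yield `None`.
    fn parse(s: &str) -> Option<Capability> {
        Self::ALL.iter().copied().find(|c| c.as_str() == s)
    }
}
```

Deriving `parse` from `as_str` keeps the two directions from drifting apart when a variant is added, at the cost of a linear scan over a very small array.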

Pricing struct unchanged: per-image / per-second / per-1k-chars unit
costs ride existing `output_per_mtok_micro` field with the unit
documented in `notes` (e.g. "Per-image cost. 1 unit = 1 image."). A
proper Pricing.unit field is a follow-up.
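
One possible shape for that follow-up is sketched below. Everything in it is hypothetical: `PricingUnit`, the simplified `Pricing` struct, `unit_cost_micro`, and `estimate_micro` are illustration names, and the sketch assumes the `_micro` suffix means micro-USD. The real Pricing struct is unchanged by this commit.

```rust
/// Hypothetical discriminator for what one priced unit means.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PricingUnit {
    PerMTok,
    PerImage,
    PerSecond,
    Per1kChars,
}

/// Hypothetical, simplified pricing row (assumed micro-USD per unit).
struct Pricing {
    unit: PricingUnit,
    unit_cost_micro: u64,
}

/// Estimated cost in micro-USD for `quantity` natural units
/// (tokens, images, seconds, or characters, depending on `unit`).
fn estimate_micro(p: &Pricing, quantity: u64) -> u64 {
    // How many natural units make up one priced unit.
    let divisor: u128 = match p.unit {
        PricingUnit::PerMTok => 1_000_000,
        PricingUnit::PerImage | PricingUnit::PerSecond => 1,
        PricingUnit::Per1kChars => 1_000,
    };
    // Widen before multiplying so large token counts cannot overflow u64,
    // then truncate toward zero on the division.
    let total = u128::from(p.unit_cost_micro) * u128::from(quantity) / divisor;
    total as u64
}
```

With a placeholder (not real) price of 40_000 micro-USD per image, `estimate_micro` for 3 images gives 120_000. The same stored number read as a per-mtok price would be divided by a million, which is the misreading an explicit unit field is meant to prevent.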

Files:
- `_primitives/_rust/kei-model/src/model.rs` (+24 LOC enum extensions)
- `_primitives/_rust/kei-model/data/models.toml` (+216 LOC, 471 total)

`kei-model list` returns the full 21-model catalog incl. new providers.

Tests:
- kei-registry: 25 passed (existing + 5 secrets tests + 10 status)
- kei-model: 0 (no unit tests in crate, parser smoke via list)
- agent-assembler: 29 passed (no regressions)

Verification (cited):
- `./target/release/kei-registry secrets --env-file ~/.claude/secrets/.env`
  emits real report 26/11 orphan.
- `./target/release/kei-model list` parses all 21 entries cleanly.
- `cargo build --release --workspace` clean.

What this does NOT do (deferred):
- Pricing.unit field (per-mtok / per-image / per-second / per-1k-chars
  discriminator) — needs Rust struct refactor + cost-estimator update.
- `secrets` skip-list extension (worktrees, _ts_packages/node_modules
  duplicate counts) — minor noise.
- gemini-3-pro-image pricing (no public page; vendor-specific quote
  needed).
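
The deferred skip-list extension could be approached with a path-level check rather than a single directory name, since `worktrees` is only noise when it sits under `.claude`. A minimal std-only sketch, assuming paths relative to the scan root; `SKIP_PATHS` and `should_skip` are hypothetical names, and `SKIP_DIRS` is copied from `secrets.rs` for self-containment:

```rust
use std::path::{Component, Path};

/// Single directory names to prune (same list as in `secrets.rs`).
const SKIP_DIRS: &[&str] = &["target", "node_modules", ".git", "_generated"];

/// Hypothetical multi-component sequences to prune wherever they occur.
const SKIP_PATHS: &[&[&str]] = &[&[".claude", "worktrees"]];

/// True when `rel` lies under a skipped directory name or contains one of
/// the skipped component sequences.
fn should_skip(rel: &Path) -> bool {
    // Keep only ordinary UTF-8 components; `.`, `..`, and roots are ignored.
    let parts: Vec<&str> = rel
        .components()
        .filter_map(|c| match c {
            Component::Normal(s) => s.to_str(),
            _ => None,
        })
        .collect();
    if parts.iter().any(|p| SKIP_DIRS.contains(p)) {
        return true;
    }
    SKIP_PATHS
        .iter()
        .any(|seq| parts.windows(seq.len()).any(|w| w == *seq))
}
```

Matching on consecutive components keeps an unrelated top-level `worktrees/` directory in scope while still dropping the duplicated checkouts.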

=== STATUS-TRUTH MARKER ===
shipped: functional
stubs: 0
cargo-check: PASS
behaviour-verified: yes
follow-up-required:
  - Pricing.unit field for cost-estimator correctness on gen models
  - secrets scan: skip .claude/worktrees/ to avoid duplicate counts
  - gemini-3-pro-image price verification

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 00:06:16 +08:00


//! Secret-reference orphan detector.
//!
//! Reads env-var NAMES from `.env` files (never values), greps the kit
//! tree for usages, returns a `SecretsReport` with per-key usage counts
//! and orphan list. Constructor Pattern: pure read-side cube.

use anyhow::Result;
use regex::Regex;
use serde::{Deserialize, Serialize};
use std::collections::BTreeMap;
use std::path::{Path, PathBuf};
use walkdir::WalkDir;

#[derive(Debug, Clone, Serialize, Deserialize, Default)]
pub struct SecretsReport {
    pub keys: Vec<KeyRow>,
    pub scanned_files: u64,
    pub env_files: Vec<String>,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct KeyRow {
    pub name: String,
    pub source_env_file: String,
    /// Number of scanned files that mention the key (not total occurrences).
    pub usage_count: u64,
    /// Up to 5 files where the key appears, in directory-walk order.
    pub usage_files: Vec<String>,
    pub orphan: bool,
}

/// Directory names pruned from the walk.
const SKIP_DIRS: &[&str] = &["target", "node_modules", ".git", "_generated"];
/// File extensions treated as scannable text.
const TEXT_EXTS: &[&str] = &["rs", "toml", "md", "sh", "py", "ts", "js", "yml", "yaml", "json"];

/// Returns true when `s` matches `^[A-Z][A-Z0-9_]*$`.
pub(crate) fn is_valid_key(s: &str) -> bool {
    let mut chars = s.chars();
    match chars.next() {
        Some(c) if c.is_ascii_uppercase() => {
            chars.all(|c| c.is_ascii_uppercase() || c.is_ascii_digit() || c == '_')
        }
        _ => false,
    }
}

/// Extracts key names from `KEY=value` lines. Blank lines, `#` comments,
/// lines without `=`, and invalid names are skipped; values are never kept.
pub(crate) fn parse_env_file(path: &Path) -> Result<Vec<String>> {
    let content = std::fs::read_to_string(path)?;
    let mut keys = Vec::new();
    for line in content.lines() {
        let trimmed = line.trim();
        if trimmed.is_empty() || trimmed.starts_with('#') {
            continue;
        }
        let Some(idx) = trimmed.find('=') else { continue; };
        let key = trimmed[..idx].trim();
        if is_valid_key(key) {
            keys.push(key.to_string());
        }
    }
    Ok(keys)
}

fn is_text_file(path: &Path) -> bool {
    path.extension().and_then(|e| e.to_str()).map_or(false, |ext| TEXT_EXTS.contains(&ext))
}

/// Word-boundary regex for `key`, so `MY_KEY` does not match `MY_KEY_EXTRA`.
fn word_re(key: &str) -> Result<Regex> {
    Ok(Regex::new(&format!(r"\b{}\b", regex::escape(key)))?)
}

/// Scan `scan_root`, returning the scanned-file count and a per-key
/// (matching-file count, first matching files) map.
pub(crate) fn scan_usages(
    keys: &[String],
    scan_root: &Path,
) -> Result<(u64, BTreeMap<String, (u64, Vec<String>)>)> {
    let patterns: Vec<(String, Regex)> = keys
        .iter()
        .map(|k| Ok((k.clone(), word_re(k)?)))
        .collect::<Result<_>>()?;
    let mut counts: BTreeMap<String, (u64, Vec<String>)> = BTreeMap::new();
    for k in keys {
        counts.insert(k.clone(), (0, Vec::new()));
    }
    let mut scanned = 0u64;
    for entry in WalkDir::new(scan_root)
        .follow_links(false)
        .into_iter()
        // The predicate also sees the root entry (depth 0); never prune it,
        // even when `scan_root` itself is named like a skip dir.
        .filter_entry(|e| {
            e.depth() == 0 || !SKIP_DIRS.contains(&e.file_name().to_string_lossy().as_ref())
        })
        .flatten()
    {
        if !entry.file_type().is_file() || !is_text_file(entry.path()) {
            continue;
        }
        // Unreadable or non-UTF-8 files are skipped and not counted as scanned.
        let Ok(content) = std::fs::read_to_string(entry.path()) else { continue; };
        scanned += 1;
        let rel = entry
            .path()
            .strip_prefix(scan_root)
            .unwrap_or(entry.path())
            .to_string_lossy()
            .to_string();
        for (key, re) in &patterns {
            if re.is_match(&content) {
                let e = counts.get_mut(key).expect("key present");
                // One increment per matching file, regardless of hit count.
                e.0 += 1;
                if e.1.len() < 5 {
                    e.1.push(rel.clone());
                }
            }
        }
    }
    Ok((scanned, counts))
}

/// Build a `SecretsReport`. Pure: no side effects beyond file reads.
pub fn compute_secrets_report(env_paths: &[PathBuf], scan_root: &Path) -> Result<SecretsReport> {
    let mut all_keys: Vec<(String, String)> = Vec::new();
    let mut env_file_labels: Vec<String> = Vec::new();
    for ep in env_paths {
        let label = ep.to_string_lossy().to_string();
        env_file_labels.push(label.clone());
        for k in parse_env_file(ep).unwrap_or_default() {
            all_keys.push((k, label.clone()));
        }
    }
    let unique_keys: Vec<String> = {
        let mut seen = std::collections::HashSet::new();
        all_keys
            .iter()
            .filter(|(k, _)| seen.insert(k.clone()))
            .map(|(k, _)| k.clone())
            .collect()
    };
    let (scanned_files, counts) = scan_usages(&unique_keys, scan_root)?;
    let mut rows: Vec<KeyRow> = all_keys
        .into_iter()
        .map(|(name, source_env_file)| {
            let (usage_count, usage_files) = counts.get(&name).cloned().unwrap_or_default();
            KeyRow { orphan: usage_count == 0, name, source_env_file, usage_count, usage_files }
        })
        .collect();
    rows.sort_by(|a, b| a.name.cmp(&b.name));
    Ok(SecretsReport { keys: rows, scanned_files, env_files: env_file_labels })
}

/// Render a `SecretsReport` as ASCII text.
pub fn render_ascii(r: &SecretsReport) -> String {
    use std::fmt::Write as FmtWrite;
    let mut out = String::new();
    let mut by_file: BTreeMap<&str, Vec<&KeyRow>> = BTreeMap::new();
    for row in &r.keys {
        by_file.entry(row.source_env_file.as_str()).or_default().push(row);
    }
    for (file, rows) in &by_file {
        let orphan_count = rows.iter().filter(|r| r.orphan).count();
        let _ = writeln!(out, "[Secrets — {} ({} keys, {} orphan)]", file, rows.len(), orphan_count);
        for row in rows.iter() {
            render_row(&mut out, row);
        }
    }
    let total_orphans = r.keys.iter().filter(|k| k.orphan).count();
    let _ = writeln!(
        out,
        "Total: {} keys across {} env files, {} orphan",
        r.keys.len(),
        r.env_files.len(),
        total_orphans
    );
    out
}

fn render_row(out: &mut String, row: &KeyRow) {
    use std::fmt::Write as FmtWrite;
    if row.orphan {
        let _ = writeln!(out, " {:<35} *ORPHAN* 0 refs — candidate for removal", row.name);
        return;
    }
    let files_str = row.usage_files.join(", ");
    let extra = if row.usage_count > row.usage_files.len() as u64 {
        format!(", +{} more", row.usage_count - row.usage_files.len() as u64)
    } else {
        String::new()
    };
    let _ = writeln!(out, " {:<35} {:>4} refs ({}{})", row.name, row.usage_count, files_str, extra);
}

#[cfg(test)]
#[path = "secrets_tests.rs"]
mod tests;