KeiSeiKit-1.0/_primitives/_rust/kei-buddy/src/topic_classify.rs
Parfii-bot 450156a476 feat(kei-buddy fleet): 5 atomics — google/apple contacts + classifier + tick + slash-commands
Parallel agent batch. All five tasks delivered functional + tested.
NOT deployed — user is in live conversation with the bot.

## Crates added (2 new)

### kei-contacts-google (466 LOC, 5 tests)
Thin Google People API client. Takes pre-acquired access_token from
kei-auth-google's OAuth flow; calls /v1/people/me/connections?personFields=...,
parses 200-entry first page (TODO: pagination via nextPageToken), maps
to kei_social_store::Person. Errors: Http / Auth(401) / Parse.
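
The pagination TODO could be closed with a loop of roughly this shape. This is a minimal synchronous sketch, not the crate's code: `Page`, `fetch_all` and the `fetch_page` closure are hypothetical names, and the closure stands in for the HTTP call that would pass the token back as the `pageToken` query parameter.

```rust
/// One page of results plus the token for the next page, if any.
/// Hypothetical shape standing in for the parsed People API response.
pub struct Page<T> {
    pub items: Vec<T>,
    pub next_page_token: Option<String>,
}

/// Collect every page by feeding each `next_page_token` back into `fetch_page`.
/// `max_pages` caps the loop so a misbehaving server cannot keep it running.
pub fn fetch_all<T, E>(
    mut fetch_page: impl FnMut(Option<&str>) -> Result<Page<T>, E>,
    max_pages: usize,
) -> Result<Vec<T>, E> {
    let mut out = Vec::new();
    let mut token: Option<String> = None;
    for _ in 0..max_pages {
        // The first call carries no token; later calls carry the previous one.
        let page = fetch_page(token.as_deref())?;
        out.extend(page.items);
        match page.next_page_token {
            Some(t) if !t.is_empty() => token = Some(t),
            _ => break, // no token means this was the last page
        }
    }
    Ok(out)
}
```

An async variant would have the same structure, with the closure returning a future that is awaited inside the loop.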

### kei-contacts-apple (593 LOC, 7 tests + 1 doc-test)
CardDAV client for iCloud Contacts using Basic Auth (Apple ID +
app-specific password). Sends REPORT with addressbook-query XML body,
parses multistatus → embedded vCards → AppleContact. Tiny vCard
parser (~150 LOC) handles FN/N/EMAIL/TEL/ORG/NOTE/UID, single-line
only (no line-folding for MVP). Discovery (PROPFIND .well-known/carddav
→ principal → addressbook-home-set) deferred — user supplies
addressbook URL via with_addressbook_url().
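
For reference, the deferred line-folding support is small. RFC 6350 §3.2 folds a long line by inserting a line break followed by one space or tab, so unfolding means joining such continuation lines and dropping that one character. A sketch, with `unfold_vcard` as a hypothetical helper name that is not part of the crate:

```rust
/// Undo RFC 6350 §3.2 line folding: a line that starts with one space or
/// tab continues the previous line, and that single character is dropped.
pub fn unfold_vcard(raw: &str) -> Vec<String> {
    let mut lines: Vec<String> = Vec::new();
    for line in raw.lines() {
        // `str::lines` already strips "\n" and "\r\n" terminators.
        let rest = line.strip_prefix(' ').or_else(|| line.strip_prefix('\t'));
        match rest {
            // Continuation: append to the previous logical line.
            Some(rest) if !lines.is_empty() => {
                let last = lines.len() - 1;
                lines[last].push_str(rest);
            }
            // Ordinary line (or a stray continuation with nothing before it).
            _ => lines.push(line.to_string()),
        }
    }
    lines
}
```

Running this before the existing single-line property parser would let the parser itself stay unchanged.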

Both crates registered in workspace members.

## kei-buddy crate additions

### src/topic_classify.rs (116 LOC, 3 tests)
Free fn classify_and_store_topic(extractor, topics, chat_id, text)
called from process_text when state == OnboardState::Ready. Builds
classifier prompt → LLM → parses {slug, title} → validates slug
shape (ASCII alphanumerics, '-' or '_', max 40 bytes) →
Topics::add_topic + add_digest. All failure paths are logged and
swallowed; conversation never blocks.
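
The classifier prompt asks for kebab-case slugs of at most 30 characters, while `is_valid_slug` in the file also admits underscores, uppercase letters and up to 40 bytes. If the stricter contract is wanted, a check along these lines would enforce it. This is a sketch only: `is_kebab_slug` and `MAX_SLUG_LEN` are hypothetical names, and the 30-character limit is taken from the prompt text.

```rust
/// Limit taken from the classifier prompt; hypothetical constant name.
const MAX_SLUG_LEN: usize = 30;

/// Strict kebab-case: lowercase ASCII letters or digits in segments joined
/// by single hyphens, with no leading, trailing or doubled hyphen.
pub fn is_kebab_slug(slug: &str) -> bool {
    !slug.is_empty()
        && slug.len() <= MAX_SLUG_LEN
        && slug.split('-').all(|seg| {
            // An empty segment means a leading, trailing or doubled hyphen.
            !seg.is_empty()
                && seg.bytes().all(|b| b.is_ascii_lowercase() || b.is_ascii_digit())
        })
}
```

Tightening the check trades a few more skipped classifications for slugs that are uniform enough to use as stable keys.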

### src/tick.rs (188 LOC, 3 integration tests) + src/bin/kei-buddy-tick.rs (67 LOC)
Second binary. Oneshot CLI for systemd timer: walks all known
chat_ids in BuddyStore → lists topics → searches recent chat
messages per topic (configurable window/limit) → LLM digest →
Topics::add_digest. Outputs JSON TickReport to stdout. Env-driven
config. NoOpExtractor fallback when no LLM creds (graceful degradation).

### src/commands.rs (146 LOC) + src/command_exec.rs (111 LOC, 7 tests)
Slash-commands intercepted BEFORE handle_step in process_text:
  /whois <name>   contacts.search_contacts + common_connections for hits
  /find <q>       chat_log.search scoped to chat_id
  /topics         topics.list_topics
  /contacts       contacts.search_contacts("", 10)
  /help           static usage text (Russian)
If command parsed, response built from stores, sent, logged to
chat_log — FSM skipped for that turn.
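
The interception step can be pictured as a pure parse function that returns `None` for anything that is not a known command, so the caller falls through to the FSM. A sketch under assumptions: the `Command` enum and `parse_command` are illustrative names rather than the contents of src/commands.rs, and treating `/whois` or `/find` without an argument as "not a command" is a guess at the intended behaviour.

```rust
/// Parsed slash-command; hypothetical shape mirroring the list above.
#[derive(Debug, PartialEq, Eq)]
pub enum Command {
    Whois(String),
    Find(String),
    Topics,
    Contacts,
    Help,
}

/// Return `Some(Command)` when `text` is a recognised slash-command,
/// `None` otherwise so the normal conversation flow can handle it.
pub fn parse_command(text: &str) -> Option<Command> {
    let text = text.trim();
    if !text.starts_with('/') {
        return None;
    }
    // Split the command word from its (optional) argument.
    let (head, rest) = match text.split_once(char::is_whitespace) {
        Some((h, r)) => (h, r.trim()),
        None => (text, ""),
    };
    match (head, rest) {
        ("/whois", name) if !name.is_empty() => Some(Command::Whois(name.to_string())),
        ("/find", q) if !q.is_empty() => Some(Command::Find(q.to_string())),
        ("/topics", _) => Some(Command::Topics),
        ("/contacts", _) => Some(Command::Contacts),
        ("/help", _) => Some(Command::Help),
        _ => None,
    }
}
```

Keeping parsing separate from execution means the parser can be unit-tested without any store or network handle.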

### src/serve_runner.rs (69 LOC) — refactor
run_serve + start_listener + init_tracing extracted out of serve.rs
to bring serve.rs back to 189 LOC (was 248 after previous wave).

### Wiring
BuddyContext gains `contacts: Arc<Contacts>` and `topics: Arc<Topics>`.
ServeConfig gains contacts_db_path + topics_db_path. Binary reads
KEI_BUDDY_CONTACTS_DB_PATH + KEI_BUDDY_TOPICS_DB_PATH env (defaults
./kei-buddy-contacts.db, ./kei-buddy-topics.db). cmd_migrate applies
schema for all three side-stores (chat_log + contacts + topics).
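
The env-with-default lookup described above might be written as follows. A sketch, not the binary's actual code: `path_or_default`, `contacts_db_path` and `topics_db_path` are hypothetical helpers, and treating an empty variable as unset is an assumption.

```rust
use std::path::PathBuf;

/// Pick the configured path, falling back to `default` when the value is
/// missing or blank (the blank case is an assumption, not confirmed behaviour).
pub fn path_or_default(value: Option<String>, default: &str) -> PathBuf {
    match value {
        Some(v) if !v.trim().is_empty() => PathBuf::from(v),
        _ => PathBuf::from(default),
    }
}

/// Contacts DB path from KEI_BUDDY_CONTACTS_DB_PATH, else the documented default.
pub fn contacts_db_path() -> PathBuf {
    path_or_default(
        std::env::var("KEI_BUDDY_CONTACTS_DB_PATH").ok(),
        "./kei-buddy-contacts.db",
    )
}

/// Topics DB path from KEI_BUDDY_TOPICS_DB_PATH, else the documented default.
pub fn topics_db_path() -> PathBuf {
    path_or_default(
        std::env::var("KEI_BUDDY_TOPICS_DB_PATH").ok(),
        "./kei-buddy-topics.db",
    )
}
```

Splitting the pure fallback logic from the `std::env::var` call keeps the default handling testable without mutating process environment.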

## Verify-before-commit (RULE 0.13 §)
  * cargo check -p kei-buddy (default + extractor-openai): PASS
  * cargo test -p kei-buddy --lib: 41 passed / 0 failed (was 31)
  * cargo test -p kei-buddy --tests: 3 passed (tick integration)
  * cargo build -p kei-buddy --features extractor-openai: PASS
    (builds both kei-buddy + kei-buddy-tick binaries)
  * cargo check -p kei-contacts-google: PASS (5 tests)
  * cargo check -p kei-contacts-apple: PASS (7 + 1 doc)
  * cargo check --workspace: PASS

## STATUS-TRUTH from all 5 agents: shipped=functional, behaviour-verified=yes

## Follow-up (deferred, non-blocking)
  * Google People API pagination (nextPageToken loop) — first 200 only
  * CardDAV auto-discovery (PROPFIND .well-known/carddav)
  * vCard line-folding (RFC 6350 §3.2)
  * Wire kei-contacts-google + kei-contacts-apple → Contacts.add_contact
    sync command (no glue yet)
  * systemd timer file for kei-buddy-tick (not shipped here — config only)
2026-05-12 16:33:58 +08:00

// SPDX-License-Identifier: Apache-2.0
//! Topic classification helper — free function invoked after `OnboardState::Ready`.
//!
//! Constructor Pattern: one responsibility — LLM classify + Topics store, fire-and-forget.

use std::time::{SystemTime, UNIX_EPOCH};

use tracing::{error, warn};

use crate::{extractor::LlmExtractor, topics::Topics};

const CLASSIFY_PROMPT: &str = concat!(
    "You are a topic classifier. Output a single JSON object with two string fields: ",
    "\"slug\" (kebab-case, ascii, ≤30 chars, like \"work-meetings\") and ",
    "\"title\" (human-readable in the user's language, ≤50 chars). ",
    "Classify the following user message into ONE topic. ",
    "Output only the JSON, no prose, no markdown fences."
);

/// Classify `text` into a topic and store it in `topics`.
///
/// Never panics and returns `()`: every failure is logged and swallowed so
/// the caller's conversation flow is never blocked.
pub async fn classify_and_store_topic(
    extractor: &dyn LlmExtractor,
    topics: &Topics,
    chat_id: i64,
    text: &str,
) {
    // Ask the LLM for a `{slug, title}` JSON object.
    let val = match extractor.extract(CLASSIFY_PROMPT, text).await {
        Ok(v) => v,
        Err(e) => {
            warn!(chat_id, error = %e, "topic classifier LLM call failed");
            return;
        }
    };

    // Both fields must be present, be strings, and be non-empty.
    let slug = match val.get("slug").and_then(|v| v.as_str()) {
        Some(s) if !s.is_empty() => s.to_string(),
        _ => {
            warn!(chat_id, "topic classifier returned no slug field");
            return;
        }
    };
    let title = match val.get("title").and_then(|v| v.as_str()) {
        Some(s) if !s.is_empty() => s.to_string(),
        _ => {
            warn!(chat_id, "topic classifier returned no title field");
            return;
        }
    };

    if !is_valid_slug(&slug) {
        warn!(chat_id, slug = %slug, "topic slug failed validation; skipping");
        return;
    }

    // A failed `add_topic` is logged but does not return early:
    // the digest write below is still attempted.
    if let Err(e) = topics.add_topic(chat_id, &slug, &title, text).await {
        error!(chat_id, slug = %slug, error = %e, "topics.add_topic failed");
    }

    // Unix timestamp in seconds; falls back to 0 if the clock is before the epoch.
    let now = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_secs() as i64)
        .unwrap_or(0);
    if let Err(e) = topics.add_digest(chat_id, &slug, now, text).await {
        error!(chat_id, slug = %slug, error = %e, "topics.add_digest failed");
    }
}

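/// Accept non-empty slugs of at most 40 bytes made of ASCII alphanumerics,
/// `-` or `_`. Note this is looser than the prompt's contract (kebab-case,
/// ≤30 chars): underscores, uppercase letters and longer slugs also pass.
fn is_valid_slug(slug: &str) -> bool {
    !slug.is_empty()
        && slug.len() <= 40
        && slug.chars().all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_')
}
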
// ── Tests ─────────────────────────────────────────────────────────────────────
#[cfg(test)]
mod tests {
    use super::*;
    use crate::{extractor::MockExtractor, topics::Topics};
    use serde_json::json;

    async fn make_topics() -> Topics {
        Topics::from_memory().unwrap()
    }

    #[tokio::test]
    async fn classify_and_store_skips_invalid_slug() {
        let extractor = MockExtractor::new(json!({"slug": "has spaces", "title": "X"}));
        let topics = make_topics().await;
        classify_and_store_topic(&extractor, &topics, 1, "hello").await;
        assert!(topics.list_topics(1).await.unwrap().is_empty());
    }

    #[tokio::test]
    async fn classify_and_store_adds_topic_for_valid_slug() {
        let extractor = MockExtractor::new(json!({"slug": "work-stuff", "title": "Work Stuff"}));
        let topics = make_topics().await;
        classify_and_store_topic(&extractor, &topics, 1, "I have a meeting").await;
        assert_eq!(topics.list_topics(1).await.unwrap().len(), 1);
    }

    #[tokio::test]
    async fn classify_and_store_handles_missing_fields() {
        let extractor = MockExtractor::new(json!({}));
        let topics = make_topics().await;
        classify_and_store_topic(&extractor, &topics, 1, "any text").await;
        assert!(topics.list_topics(1).await.unwrap().is_empty());
    }
}