KeiSeiKit-1.0/_blocks/api-anthropic.md
denis 0b901cf2f9 feat: KeiSeiKit v0.1.0 — initial public release
Generic Constructor-Pattern agent kit for Claude Code. Zero personal data,
fully English, MIT-licensed.

Contents:
- 34 reusable blocks (baseline, rules, stack/deploy/domain/api/scraper)
- 14 cross-project agent manifests (code/ml/infra/researcher/critic/...)
- 6 portable skills (/new-agent, /research, /test-gen, /debug-deep, /pr-review, /refactor)
- Rust assembler (single binary, ~500 KB)
- 3 hooks (auto-reassemble, pre-commit validate, no-hand-edit)
- install.sh (idempotent, cargo-builds on first run)
- MIT LICENSE

All 6 sanity greps pass: 0 Russian text, 0 specific project names,
0 incident numbers, 0 user paths, 0 hardcoded IPs, 0 API keys.

cargo check + assemble --validate: both pass on 14 manifests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 23:58:34 +08:00

2.2 KiB

API — Anthropic (Claude)

Full text: Anthropic docs (WebFetch https://docs.anthropic.com/en/api before any new feature). Claude API skill trigger: code imports anthropic / @anthropic-ai/sdk.

Model IDs (from env, never hard-code):

  • Opus tier — max effort, 1M input tokens on the [1m] variant
  • Sonnet tier — balanced cost / capability
  • Haiku tier — cheapest, latency-critical
  • Keep ID in env var (ANTHROPIC_MODEL) — swapping Opus→Sonnet should be 0 code changes.

Prompt caching (up to ~90% cost reduction + latency drop on cache hit):

  • 4 cache breakpoints per request (cache_control: {type: "ephemeral"})
  • Two TTLs: default 5-min (cheap writes) and 1-hour (premium writes, higher $/token)
  • Same prefix sent >N times → MUST cache_control — missing caching on a long system prompt is free money left on the table
  • Log cache_read_input_tokens vs cache_creation_input_tokens every call — if read is zero across N calls, cache is mis-wired

Tool use:

  • Fine-grained tool streaming supported (parse tool_use deltas, don't wait for full turn)
  • tool_choice: "auto" | "any" | {type: "tool", name} — pick any when you need some tool but don't care which
  • Cap turn loop with max_iterations (default 10) — infinite loop on broken tool = infinite cost
  • Every tool_use MUST have matching tool_result — orphan tool_use errors mid-turn

Batch API: 50% discount, 24h window. Use for offline eval / bulk-ingest / non-interactive tasks. Polling via batch ID.

Extended thinking: thinking: {type: "enabled", budget_tokens: N}. Higher budget → deeper reasoning. Visible thinking is billed; hidden is not streamed but still billed.

Cost tracking (mandatory per-call log): input_tokens, output_tokens, cache_read_input_tokens, cache_creation_input_tokensmemory/{project}.md. Rates change — WebFetch https://www.anthropic.com/pricing before any budgeted run [VERIFY: live pricing page].

Forbidden: hard-coding model strings in source (use env var); using deprecated IDs without a migration note citing the replacement; sending the same >2K-token prefix >3 times without cache_control; skipping per-call cost log (no data → no decisions).