Single-commit clean baseline after security scrub of niche-tells, project codenames, internal jargon, and contributor-email leaks.

Contents:
- 100 Rust crates (_primitives/_rust/)
- 37 agent manifests (_manifests/) + generated specs (_generated/)
- 67 user-invocable skills (skills/)
- 33 hooks (hooks/)
- Composition blocks (_blocks/)
- Documentation (docs/, README.md)
- TS adapter packages (_ts_packages/)
- Assembler (_assembler/)
- Roles (_roles/)
- Templates (_templates/)
- Forgejo CI (.forgejo/)

Author: Denis Parfionovich <info@greendragon.info>
License: see LICENSE.
API — ElevenLabs (voice)
Live pricing: WebFetch https://elevenlabs.io/pricing before any bulk run [VERIFY: character pricing tier varies by plan].
MANDATORY 3-step Voice Design flow (order is fixed):
1. `designVoice`: describe voice characteristics (gender, age, accent, style) → returns preview audio + `generated_voice_id` (ephemeral).
2. `createVoice`: accept the preview → permanent `voice_id` added to library.
3. TTS: synthesize text using the permanent `voice_id`.
Skipping or reordering any step = API error. Ephemeral preview IDs expire — cannot TTS directly from designVoice output.
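One way to make the fixed order hard to violate is a thin wrapper that refuses to run a step before its predecessor. This is a minimal sketch, not the ElevenLabs SDK: the class `VoiceFlow` and its three injected callables are hypothetical stand-ins for whatever client code actually performs `designVoice`, `createVoice`, and TTS.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class VoiceFlow:
    """Guards the design -> create -> TTS order around injected callables.

    All three callables are hypothetical placeholders for real API calls.
    """

    design_voice: Callable[[str], str]        # description -> generated_voice_id (ephemeral)
    create_voice: Callable[[str], str]        # generated_voice_id -> permanent voice_id
    synthesize: Callable[[str, str], bytes]   # (voice_id, text) -> audio bytes
    generated_voice_id: Optional[str] = None
    voice_id: Optional[str] = None

    def design(self, description: str) -> str:
        # Step 1: obtain an ephemeral preview ID.
        self.generated_voice_id = self.design_voice(description)
        return self.generated_voice_id

    def create(self) -> str:
        # Step 2: only valid once a preview exists.
        if self.generated_voice_id is None:
            raise RuntimeError("createVoice requires a preview from designVoice first")
        self.voice_id = self.create_voice(self.generated_voice_id)
        return self.voice_id

    def tts(self, text: str) -> bytes:
        # Step 3: never synthesize from the ephemeral preview ID.
        if self.voice_id is None:
            raise RuntimeError("TTS requires a permanent voice_id from createVoice")
        return self.synthesize(self.voice_id, text)
```

Calling `tts()` before `create()` raises locally, so the mistake surfaces before any characters are spent on a request that would fail.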
Models:
| Model | Use case | Latency | Quality |
|---|---|---|---|
| `eleven_flash_v2_5` | Real-time, low latency (~75ms) | Fastest | Good |
| `eleven_multilingual_v2` | Production, 29 languages | Slower | Best |
| `eleven_turbo_v2_5` | Balanced | Fast | High |
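The model table can be encoded as a small lookup so callers pick by priority rather than hard-coding IDs. The model IDs come from the table; the priority labels and the `pick_model` helper are illustrative assumptions.

```python
# Model IDs are taken from the table; the priority keys are hypothetical labels.
MODEL_BY_PRIORITY = {
    "latency": "eleven_flash_v2_5",       # real-time, fastest
    "quality": "eleven_multilingual_v2",  # production, best quality, slower
    "balanced": "eleven_turbo_v2_5",      # fast, high quality
}


def pick_model(priority: str = "balanced") -> str:
    """Return the model ID for a priority, failing loudly on unknown labels."""
    try:
        return MODEL_BY_PRIORITY[priority]
    except KeyError:
        options = ", ".join(sorted(MODEL_BY_PRIORITY))
        raise ValueError(f"unknown priority {priority!r}; expected one of: {options}") from None
```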
Pricing [VERIFY: check live pricing page] — billed per character, plan-gated character quota:
- Free: ~10K chars/mo
- Starter: ~30K chars/mo
- Creator / Pro / Scale — higher quotas, character overage rates vary per plan.
- Voice Design calls also consume characters (preview audio counts).
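Because billing is per character, a batch can be sized before it runs. The sketch below assumes billed characters are roughly `len(text)` per request, which should be checked against the live pricing page; the helper names are hypothetical, and the quota is passed in rather than hard-coded because the figures above are approximate.

```python
from typing import Iterable


def chars_needed(texts: Iterable[str]) -> int:
    """Estimate billed characters for a batch (assumes one billed char per text char)."""
    return sum(len(t) for t in texts)


def fits_quota(texts: Iterable[str], quota: int, used: int = 0) -> bool:
    """True if the batch fits in what is left of the monthly character quota."""
    return used + chars_needed(texts) <= quota
```

For example, with a Free-tier quota of about 10K characters, a 100-character line no longer fits once 9,950 characters are already used.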
TTS params (sane defaults):
- `stability: 0.5` (range 0-1): higher = more monotone, lower = more expressive
- `similarity_boost: 0.75`: higher = closer to reference voice
- `style: 0-1`: emotional exaggeration; set 0 for Flash v2 (not supported)
- `use_speaker_boost: true` for Multilingual v2
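The defaults above can be bundled into one builder that also applies the per-model rules. This is a sketch under assumptions: the helper name `voice_settings` is hypothetical, the 0-1 range for `similarity_boost` is assumed to match the other parameters, and matching Flash models by the substring `"flash"` is a simplification.

```python
def voice_settings(
    model_id: str,
    stability: float = 0.5,
    similarity_boost: float = 0.75,
    style: float = 0.0,
) -> dict:
    """Build a settings dict with the documented defaults and per-model rules."""
    for name, value in (
        ("stability", stability),
        ("similarity_boost", similarity_boost),
        ("style", style),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be within 0-1, got {value}")

    settings = {
        "stability": stability,
        "similarity_boost": similarity_boost,
        "style": style,
    }
    if "flash" in model_id:
        # Style is noted as unsupported on Flash, so force it to 0.
        settings["style"] = 0.0
    if model_id == "eleven_multilingual_v2":
        settings["use_speaker_boost"] = True
    return settings
```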
Voice ID caching: once createVoice returns a voice_id, store it in memory/{project}.md or DB. Reuse across TTS calls — re-designing the same voice = wasted characters + non-deterministic result.
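A get-or-create helper keeps the caching rule in one place. The sketch below uses a JSON file as a stand-in for `memory/{project}.md` or a DB; the file format, the `voice_key` naming, and the `create` callable (which would wrap the design and create steps) are all assumptions.

```python
import json
from pathlib import Path
from typing import Callable


def get_or_create_voice(cache_path: Path, voice_key: str, create: Callable[[], str]) -> str:
    """Return the cached voice_id for voice_key, calling create() only on a miss."""
    cache = json.loads(cache_path.read_text()) if cache_path.exists() else {}
    if voice_key not in cache:
        # Miss: run the (character-consuming) design + create flow exactly once.
        cache[voice_key] = create()
        cache_path.write_text(json.dumps(cache, indent=2))
    return cache[voice_key]
```

If the cached IDs reference private or cloned voices, the cache file belongs outside version control, in line with the storage convention noted under Forbidden.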
Video integration (if pairing with a video model that supports voice): voice_id flows into the video model's voice_ids payload. Per-speaker markers in prompts ONLY when voice_ids actually sent.
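The marker rule can be enforced where the payload is built. In this sketch only the `voice_ids` field name comes from the text above; the `prompt` key, the `[speaker]` marker format, and the function name are hypothetical and depend on the video model in use.

```python
from typing import Dict, List, Tuple


def build_video_payload(lines: List[Tuple[str, str]], voice_ids: Dict[str, str]) -> dict:
    """lines: (speaker, text) pairs; voice_ids: speaker -> cached voice_id."""
    if not voice_ids:
        # No voices sent: emit plain text with no per-speaker markers.
        return {"prompt": "\n".join(text for _, text in lines)}

    # Voices sent: mark each line and list voice IDs in order of first appearance.
    speakers = list(dict.fromkeys(speaker for speaker, _ in lines))
    return {
        "prompt": "\n".join(f"[{speaker}] {text}" for speaker, text in lines),
        "voice_ids": [voice_ids[speaker] for speaker in speakers],
    }
```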
Cost tracking: log per-call characters_used + cumulative month-to-date → memory/{project}.md. Hand off to kei-cost-guardian on any batch expected to exceed 50% of monthly quota.
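The tracking rule reduces to a running total plus a threshold check. A minimal sketch, assuming an in-memory log: the `CostLog` class is hypothetical, writing to `memory/{project}.md` is left out, and the actual hand-off to `kei-cost-guardian` is represented only by a boolean.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CostLog:
    """Tracks per-call characters and flags batches that need a guardian check."""

    monthly_quota: int
    entries: List[int] = field(default_factory=list)

    @property
    def month_to_date(self) -> int:
        return sum(self.entries)

    def record(self, characters_used: int) -> int:
        """Log one call and return the cumulative month-to-date total."""
        self.entries.append(characters_used)
        return self.month_to_date

    def needs_guardian(self, expected_batch_chars: int) -> bool:
        """True when a batch is expected to exceed 50% of the monthly quota."""
        return expected_batch_chars > 0.5 * self.monthly_quota
```

With a 30,000-character quota, a batch estimated at 15,001 characters trips the check, while one at exactly 15,000 does not.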
Forbidden: calling TTS without prior createVoice (ephemeral preview IDs fail); exceeding plan character quota without kei-cost-guardian check (overage billing surprise); committing voice_id values into git when they reference private/cloned voices (storage convention — see domain-has-secrets.md); re-designing the same voice per-scene instead of caching voice_id; skipping the 3-step flow with direct TTS on generated_voice_id.