A lightweight caching layer that prevents regenerating identical content. Saved approximately 60% of API quota in production by catching duplicate prompts before they hit the API.
import prompt_cache
# Check before calling expensive API
cached = await prompt_cache.get_cached(
prompt="Tell me a story about clouds",
child_name="Sophie",
language="fr"
)
if cached:
return cached # Free! No API call needed.
# Cache miss — call the API
result = await generate_story(prompt, child_name, language)
# Store for next time
await prompt_cache.set_cached(prompt, child_name, language, result)
CREATE TABLE IF NOT EXISTS prompt_cache (
prompt_hash TEXT NOT NULL,
child_name TEXT NOT NULL,
language TEXT NOT NULL,
story_json TEXT,
created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (prompt_hash, child_name, language)
);
The default implementation uses (prompt, child_name, language) as the cache key. Adapt to your domain:
(system_prompt, user_message, model)(text, voice_id, model_id)(prompt, seed, model, size)scripts/prompt_cache.py — Cache implementation (35 lines)共 1 个版本