messkan/prompt-cache
Cut LLM costs by up to 80% and unlock sub-millisecond responses with intelligent semantic caching. A drop-in, provider-agnostic LLM proxy written in Go.
PromptCache helps development teams reduce expenses and speed up applications that use large language models (LLMs). By sitting between your application and the LLM provider, it detects semantically similar user requests and serves cached answers instantly. This is ideal for developers building AI-powered applications that experience repetitive queries, like customer support bots or AI agents.
Use this if your application's LLM traffic contains many similar user prompts, each triggering a duplicate, costly, and slow API call to your provider.
Not ideal if your application primarily handles unique, non-repetitive user prompts where caching would offer minimal benefit.
Stars
209
Forks
19
Language
Go
License
MIT
Category
Vector DB
Last pushed
Jan 25, 2026
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/vector-db/messkan/prompt-cache"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
RediSearch/RediSearch
A query and indexing engine for Redis, providing secondary indexing, full-text search, vector...
redis/redis-vl-python
Redis Vector Library (RedisVL) -- the AI-native Python client for Redis.
redis-developer/redis-ai-resources
✨ A curated list of awesome community resources, integrations, and examples of Redis in the AI ecosystem.
redis-developer/redis-product-search
Visual and semantic vector similarity with Redis Stack, FastAPI, PyTorch and Huggingface.
luyug/GradCache
Run Effective Large Batch Contrastive Learning Beyond GPU/TPU Memory Constraint