sensoris/semcache
Semantic caching layer for your LLM applications. Reuse responses and reduce token usage.
semcache helps developers cut costs and latency in applications that use large language models (LLMs). It acts as an intelligent intermediary: when an incoming query is semantically similar to one seen before, it returns the cached response; otherwise it forwards the query to the LLM provider. Developers building LLM-powered applications are the primary users.
Use this if you are building an LLM application and want to reduce API costs and improve response times by intelligently reusing previous LLM outputs for similar queries.
Not ideal if your application requires every single query to be processed by the underlying LLM provider, or if you need persistent storage out of the box.
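The lookup described above can be sketched in a few lines. This is not semcache's actual implementation (which is in Rust and uses real embedding models); it is a toy illustration of the idea, with a bag-of-words `embed()` standing in for a proper embedding model and a cosine-similarity threshold deciding hit vs. miss.

```python
# Toy sketch of a semantic cache: store (embedding, response) pairs and
# serve the cached response when a new query embeds close enough to a
# previous one. embed() is a bag-of-words stand-in for a real model.
import math
from collections import Counter


def embed(text: str) -> Counter:
    # Stand-in embedding: token counts of the lowercased query.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class SemanticCache:
    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold
        self.entries: list[tuple[Counter, str]] = []

    def get(self, query: str):
        qv = embed(query)
        best = max(self.entries, key=lambda e: cosine(qv, e[0]), default=None)
        if best and cosine(qv, best[0]) >= self.threshold:
            return best[1]  # cache hit: reuse the prior LLM response
        return None         # cache miss: caller falls through to the LLM

    def put(self, query: str, response: str):
        self.entries.append((embed(query), response))


cache = SemanticCache()
cache.put("what is the capital of France", "Paris")
print(cache.get("what is the capital of France?"))  # near-duplicate query → "Paris"
```

A production cache replaces the stand-in with dense embeddings and a vector index; the hit/miss logic is the same.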
Stars: 94
Forks: 4
Language: Rust
License: MIT
Category:
Last pushed: Jan 02, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/sensoris/semcache"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
ModelEngine-Group/unified-cache-management
Persist and reuse KV Cache to speedup your LLM.
reloadware/reloadium
Hot Reloading and Profiling for Python
October2001/Awesome-KV-Cache-Compression
📰 Must-read papers on KV Cache Compression (constantly updating 🤗).
alibaba/tair-kvcache
Alibaba Cloud's high-performance KVCache system for LLM inference, with components for global...
Zefan-Cai/Awesome-LLM-KV-Cache
Awesome-LLM-KV-Cache: A curated list of 📙Awesome LLM KV Cache Papers with Codes.