sensoris/semcache
Semantic caching layer for your LLM applications. Reuse responses and reduce token usage.
semcache helps developers cut costs and latency in applications that use large language models (LLMs). It acts as an intelligent intermediary: when an incoming query is semantically similar to one seen before, it returns the cached response; otherwise it forwards the query to the LLM provider. Developers building LLM-powered applications are the primary users.
Use this if you are building an LLM application and want to reduce API costs and improve response times by intelligently reusing previous LLM outputs for similar queries.
Not ideal if your application requires every single query to be processed by the underlying LLM provider, or if you need persistent storage out of the box.
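The lookup described above can be sketched in a few lines. This is not semcache's actual implementation (which is in Rust and uses real embedding models); it is a toy illustration of the idea, with a bag-of-words `embed()` standing in for a proper embedding model and a cosine-similarity threshold deciding hit vs. miss.

```python
# Toy sketch of a semantic cache: store (embedding, response) pairs and
# serve the cached response when a new query embeds close enough to a
# previous one. embed() is a bag-of-words stand-in for a real model.
import math
from collections import Counter


def embed(text: str) -> Counter:
    # Stand-in embedding: token counts of the lowercased query.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class SemanticCache:
    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold
        self.entries: list[tuple[Counter, str]] = []

    def get(self, query: str):
        qv = embed(query)
        best = max(self.entries, key=lambda e: cosine(qv, e[0]), default=None)
        if best and cosine(qv, best[0]) >= self.threshold:
            return best[1]  # cache hit: reuse the prior LLM response
        return None         # cache miss: caller falls through to the LLM

    def put(self, query: str, response: str):
        self.entries.append((embed(query), response))


cache = SemanticCache()
cache.put("what is the capital of France", "Paris")
print(cache.get("what is the capital of France?"))  # near-duplicate query → "Paris"
```

A production cache replaces the stand-in with dense embeddings and a vector index; the hit/miss logic is the same.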
Stars: 94
Forks: 4
Language: Rust
License: MIT
Category:
Last pushed: Jan 02, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/sensoris/semcache"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
ModelEngine-Group/unified-cache-management
Persist and reuse KV Cache to speedup your LLM.
reloadware/reloadium
Hot Reloading and Profiling for Python
October2001/Awesome-KV-Cache-Compression
📰 Must-read papers on KV Cache Compression (constantly updating 🤗).
alibaba/tair-kvcache
Alibaba Cloud's high-performance KVCache system for LLM inference, with components for global...
Zefan-Cai/Awesome-LLM-KV-Cache
Awesome-LLM-KV-Cache: A curated list of 📙Awesome LLM KV Cache Papers with Codes.