MSNP1381/cache-cool

🌟 Cache-cool: A fast, flexible LLM caching proxy that reduces latency and API costs by caching repetitive calls to LLM services. πŸ”„ Supports dynamic configurations, πŸ“š multiple backends (πŸŸ₯ Redis, 🟒 MongoDB, πŸ“ JSON), and πŸ—οΈ schema-specific settings.

Score: 23 / 100 · Experimental

Cache-Cool helps developers working with Large Language Models (LLMs) by reducing the cost and improving the speed of their applications. It works by storing previous LLM responses, so if the same question or prompt is sent again, it can instantly provide the answer without contacting the LLM service. This is ideal for developers building applications that frequently interact with services like OpenAI or Claude.
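
To make the caching idea concrete, here is a minimal, illustrative Python sketch; it is not cache-cool's actual API, and the function and parameter names are hypothetical.

import hashlib
import json

_cache = {}  # illustrative in-memory store; cache-cool persists to Redis, MongoDB, or JSON

def _cache_key(model, prompt):
    # Hash model + prompt so identical requests map to the same cache entry.
    payload = json.dumps({"model": model, "prompt": prompt}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def cached_completion(model, prompt, call_llm):
    # Return the stored response for a repeated prompt; otherwise call the
    # LLM once and remember the answer for next time.
    key = _cache_key(model, prompt)
    if key in _cache:
        return _cache[key]              # cache hit: no API call, no cost
    response = call_llm(model, prompt)  # cache miss: one paid API call
    _cache[key] = response
    return response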

No commits in the last 6 months.

Use this if you are a developer building an application that makes frequent, repetitive calls to LLM services and you want to reduce API costs and improve response times.

Not ideal if your application primarily involves unique, non-repetitive LLM queries, or if you are not comfortable setting up and managing a caching proxy with Docker or Python dependencies.

LLM application development · API cost optimization · application performance · software engineering · backend services
No License · Stale (6m) · No Package · No Dependents
Maintenance: 2 / 25
Adoption: 7 / 25
Maturity: 8 / 25
Community: 6 / 25


Stars: 29
Forks: 2
Language: Python
License: None
Last pushed: Aug 17, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/MSNP1381/cache-cool"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
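
For programmatic use, a rough Python equivalent of the curl call above; the response format is not documented here, so this simply prints the raw body.

import urllib.request

url = "https://pt-edge.onrender.com/api/v1/quality/llm-tools/MSNP1381/cache-cool"
with urllib.request.urlopen(url) as resp:
    print(resp.read().decode())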