zakariaf/RAG-Cache
High-performance LLM query cache with semantic search. Reduce API costs by 80% and cut latency from 8.5s to 1ms using Redis + the Qdrant vector DB. Multi-provider support (OpenAI, Anthropic).
This project reduces the cost and improves the speed of applications that use large language models from providers like OpenAI or Anthropic. It intercepts your application's queries and, if a semantically similar question has been asked before, returns the cached answer almost instantly. It is designed for developers building LLM-powered applications who want to optimize performance and control API expenses.
Use this if you are building an application that repeatedly queries large language models and want to save on API costs and significantly reduce response times.
Not ideal if your application primarily asks unique, never-before-seen questions where caching would offer minimal benefit.
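The semantic-lookup idea described above can be sketched in a few lines of Python. This is a minimal, self-contained illustration, not the project's implementation: it assumes a toy bag-of-words embedding and an in-memory store, whereas RAG-Cache uses real embedding models with Redis and Qdrant. All names here (`SemanticCache`, `embed`, the `0.6` threshold) are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words embedding; a real cache would use a sentence-embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class SemanticCache:
    """In-memory sketch: return a stored answer when a new query is
    similar enough to a previously seen one, else signal a miss."""

    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold
        self.entries = []  # list of (embedding, answer) pairs

    def get(self, query: str):
        q = embed(query)
        best_score, best_answer = 0.0, None
        for vec, answer in self.entries:
            score = cosine(q, vec)
            if score > best_score:
                best_score, best_answer = score, answer
        # Below the threshold the caller would fall through to the LLM API.
        return best_answer if best_score >= self.threshold else None

    def put(self, query: str, answer: str):
        self.entries.append((embed(query), answer))


cache = SemanticCache(threshold=0.6)
cache.put("what is the capital of france", "Paris")
hit = cache.get("capital of france")      # similar phrasing -> cache hit
miss = cache.get("how do i bake bread")   # unrelated -> miss, call the LLM
```

The design point is the similarity threshold: set it too low and unrelated queries get wrong cached answers; too high and paraphrases miss the cache, losing the cost and latency benefit.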
Stars: 11
Forks: 4
Language: Python
License: —
Category:
Last pushed: Dec 02, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/vector-db/zakariaf/RAG-Cache"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
RediSearch/RediSearch
A query and indexing engine for Redis, providing secondary indexing, full-text search, vector...
redis/redis-vl-python
Redis Vector Library (RedisVL) -- the AI-native Python client for Redis.
redis-developer/redis-ai-resources
✨ A curated list of awesome community resources, integrations, and examples of Redis in the AI ecosystem.
redis-developer/redis-product-search
Visual and semantic vector similarity with Redis Stack, FastAPI, PyTorch and Huggingface.
luyug/GradCache
Run Effective Large Batch Contrastive Learning Beyond GPU/TPU Memory Constraint