zilliztech/GPTCache
Semantic cache for LLMs. Fully integrated with LangChain and llama_index.
This tool helps developers and architects cut costs and improve the latency of applications built on large language models (LLMs) such as ChatGPT. It caches previous LLM queries and their responses so answers can be reused for identical or semantically similar questions. This is ideal for anyone building applications that interact with LLMs frequently.
7,963 stars. No commits in the last 6 months. Available on PyPI.
Use this if you are developing an application that repeatedly asks similar questions to LLMs and want to reduce API costs and improve response times.
Not ideal if your application primarily uses LLMs for unique, never-before-seen queries where caching offers no benefit.
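The core idea behind a semantic cache is easy to sketch: embed each incoming question, compare it against stored questions, and return the cached answer when similarity clears a threshold. The snippet below is an illustrative toy, not GPTCache's actual API; it uses a bag-of-words cosine similarity in place of the neural embeddings a real semantic cache would use, and the class and threshold are hypothetical.

```python
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real semantic cache uses neural embeddings.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    """Minimal sketch: reuse answers for sufficiently similar questions."""

    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold
        self.entries = []  # list of (embedding, question, answer)

    def put(self, question: str, answer: str) -> None:
        self.entries.append((embed(question), question, answer))

    def get(self, question: str):
        q = embed(question)
        best = max(self.entries, key=lambda e: cosine(q, e[0]), default=None)
        if best and cosine(q, best[0]) >= self.threshold:
            return best[2]  # cache hit: skip the LLM call entirely
        return None         # cache miss: call the LLM, then put() the result

cache = SemanticCache()
cache.put("What is the capital of France?", "Paris")
print(cache.get("what is the capital of france"))  # similar phrasing -> "Paris"
print(cache.get("How do I bake bread?"))           # no overlap -> None
```

On a miss, the application would call the LLM as usual and store the new question/answer pair, so repeated or rephrased questions are served from the cache on later requests.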
Stars: 7,963
Forks: 570
Language: Python
License: MIT
Category:
Last pushed: Jul 11, 2025
Commits (30d): 0
Dependencies: 3
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/zilliztech/GPTCache"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
Related tools
aiming-lab/SimpleMem
SimpleMem: Efficient Lifelong Memory for LLM Agents
zilliztech/memsearch
A Markdown-first memory system, a standalone library for any AI agent. Inspired by OpenClaw.
microsoft/kernel-memory
Research project. A Memory solution for users, teams, and applications.
TeleAI-UAGI/telemem
TeleMem is a high-performance drop-in replacement for Mem0, featuring semantic deduplication,...
RichmondAlake/memorizz
MemoRizz: A Python library serving as a memory layer for AI applications. Leverages popular...