renswickd/semantic-prompt-cache

This app leverages Semantic Caching to minimize inference latency and reduce API costs by reusing semantically similar prompt responses.

Experimental

This system helps AI application developers, particularly those building chatbots or knowledge assistants, make their applications faster and cheaper to run. It takes each user question and, if a sufficiently similar question has been asked before, reuses the previous answer instead of generating a new one. The result is quicker responses for users and lower costs for large language model usage.
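
The lookup-then-reuse flow described above can be illustrated with a minimal sketch. It is not the repository's implementation: the `SemanticCache` class, the `generate` callback, the `threshold` value, and the bag-of-words `embed` function are all hypothetical stand-ins chosen so the example runs with only the standard library. A real system would most likely use a learned embedding model and a vector index instead.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model (assumption): word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    """Hypothetical cache: reuse an answer when a prior prompt is similar enough."""

    def __init__(self, generate, threshold=0.8):
        self.generate = generate    # callable that produces a fresh answer (e.g. an LLM call)
        self.threshold = threshold  # minimum similarity that counts as a cache hit
        self.entries = []           # list of (prompt vector, cached answer)
        self.hits = 0
        self.misses = 0

    def ask(self, prompt):
        query = embed(prompt)
        best_score, best_answer = 0.0, None
        # Linear scan over cached prompts; fine for a sketch, slow at scale.
        for vector, answer in self.entries:
            score = cosine(query, vector)
            if score > best_score:
                best_score, best_answer = score, answer
        if best_answer is not None and best_score >= self.threshold:
            self.hits += 1
            return best_answer
        # Cache miss: generate a new answer and store it for later reuse.
        self.misses += 1
        answer = self.generate(prompt)
        self.entries.append((query, answer))
        return answer
```

The threshold is the main design trade-off in a sketch like this: a lower value produces more cache hits and lower cost, but raises the risk of returning an answer to a question that only looks similar.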

No commits in the last 6 months.

Use this if you are building an AI application with a Retrieval-Augmented Generation (RAG) pipeline and need to reduce latency and API costs for similar or repeated user queries.

Not ideal if your application primarily handles unique, one-off queries where caching similar responses would not provide significant benefits.

AI application development, chatbot optimization, enterprise knowledge management, LLM cost reduction, system responsiveness

No License, Stale 6m, No Package, No Dependents

Maintenance 2 / 25

Adoption 5 / 25

Maturity 7 / 25

Community 0 / 25

Stars

Forks

—

Language: Python

License: None

Higher-rated alternatives

promptfoo/promptfoo

Test your prompts, agents, and RAGs. Red teaming/pentesting/vulnerability scanning for AI....

bigscience-workshop/promptsource

Toolkit for creating, sharing and using natural language prompts.

dair-ai/Prompt-Engineering-Guide

🐙 Guides, papers, lessons, notebooks and resources for prompt engineering, context engineering,...

thunlp/OpenPrompt

An Open-Source Framework for Prompt-Learning.

promptfoo/promptfoo-action

The GitHub Action for Promptfoo. Test your prompts, agents, and RAGs. AI Red teaming,...
