Madhan230205/token-reducer
⚡ Cut Claude token usage by 90%+ — free, open-source, local-first context compression for Claude Code. Hybrid RAG (BM25 + ONNX vectors), AST chunking, reranking. No API needed.
When you work with Claude Code on a large codebase, this tool selects only the parts of the code most relevant to your specific question. It takes the codebase and a query as input and outputs a much smaller set of relevant snippets. Because Claude then processes far less text, interactions are faster and cheaper.
Use this if you are a developer using Claude Code for code analysis, refactoring, or question-answering on large projects and want to drastically reduce your Claude API costs while improving response quality.
Not ideal if you are not using Claude Code, primarily work with very small codebases, or prefer to send the entire context to your AI without any form of compression.
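The retrieval idea described above (codebase plus query in, a few relevant snippets out) can be illustrated with a minimal sketch of the BM25 half of a hybrid retriever. This is not token-reducer's actual implementation: the function names `tokenize`, `bm25_rank`, and `select_context` are hypothetical, the chunks are hand-written rather than produced by AST chunking, and the vector search and reranking stages are left out entirely.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and split on non-alphanumerics, so snake_case names
    # such as load_config become the tokens "load" and "config".
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_rank(chunks, query, k1=1.5, b=0.75):
    # Score every chunk against the query with Okapi BM25.
    # Returns (score, chunk_index) pairs, best match first.
    docs = [tokenize(c) for c in chunks]
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n if n else 0.0
    # Document frequency: in how many chunks each term appears.
    df = Counter(term for d in docs for term in set(d))
    query_terms = set(tokenize(query))
    scores = []
    for i, d in enumerate(docs):
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append((score, i))
    return sorted(scores, key=lambda s: (-s[0], s[1]))


def select_context(chunks, query, top_k=2):
    # Keep only the top_k chunks that share at least one term with the query.
    ranked = bm25_rank(chunks, query)
    return [chunks[i] for score, i in ranked[:top_k] if score > 0]


# Illustrative, hand-written chunks (assumed, not taken from any real repo).
chunks = [
    "def load_config(path):\n    return json.load(open(path))",
    "def render_page(template, data):\n    return template.format(**data)",
    "def save_config(path, cfg):\n    json.dump(cfg, open(path, 'w'))",
]
for snippet in select_context(chunks, "load config from path"):
    print(snippet)
    print("---")
```

Only the chunks that share terms with the query are passed on, which is where the token savings come from in this simplified picture. A real hybrid setup would combine these lexical scores with embedding similarity before reranking.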
Stars: 7
Forks: 1
Language: Python
License: MIT
Category:
Last pushed: Apr 03, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/Madhan230205/token-reducer"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
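For scripted use, the curl request above could be made from Python roughly as follows. This is a sketch under assumptions: the helper names `quality_url` and `fetch_quality` are hypothetical, reading the three path segments as category, owner, and repository is a guess based on the example URL, and the response format is not documented here, so the body is returned as raw text.

```python
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example; the meaning of the segments
# that follow it (category / owner / repo) is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    # Percent-encode each path segment and join them onto the base URL.
    parts = (category, owner, repo)
    return BASE_URL + "/" + "/".join(quote(p, safe="") for p in parts)


def fetch_quality(category, owner, repo, timeout=10):
    # Perform the GET request and return the response body as text.
    # The response schema is unknown, so no parsing is attempted.
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch_quality("embeddings", "Madhan230205", "token-reducer")` issues the same request as the curl command and counts against the same daily request limit.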
Higher-rated alternatives
trustgraph-ai/trustgraph
The context development platform. Store, enrich, and retrieve structured knowledge with...
vectorlessflow/vectorless
Vectorless is a hierarchical, reasoning-native document intelligence engine. 🌟 Star if you like it!
gabonavarroo/faultmap
Automatically discover where and why your LLM is failing — embedding-space clustering +...
hericlesferraz/DocVault
Intelligent Document RAG with citation extraction. Upload PDFs, DOCX, PPTX or images, ask...
sanjeevafk/BibleLM
A Bible chatbot powered by Retrieval-Augmented Generation (RAG), designed to uphold fidelity to...