Elijas/token-throttle
Multi-resource rate limiting for LLM APIs. Reserve tokens before you call, refund what you don't use, stay under the limit across workers.
When you make many calls to large language model (LLM) APIs, especially across several applications or in batch jobs, it is easy to hit provider rate limits and see errors or dramatic slowdowns. This tool manages those limits by letting you reserve the tokens you expect to use before each call and refund any unused capacity afterward, so you maximize your allowed throughput without exceeding your provider's limits.
Use this if you are a developer integrating LLMs into applications and need to manage API rate limits across many concurrent calls or distributed workers.
Not ideal if you make only occasional, single LLM calls and do not need to optimize for high-volume or concurrent usage.
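The reserve-then-refund pattern described above can be sketched in a few lines. This is a minimal, in-process illustration with hypothetical names (`TokenBudget`, `reserve`, `refund`), not the actual token-throttle API: a caller claims an estimated token count before the API call, then returns whatever the provider reports as unused.

```python
import threading
import time

class TokenBudget:
    """Illustrative sketch of reserve/refund token budgeting.

    Hypothetical class; the real token-throttle API may differ.
    Tracks a rolling per-minute allowance: callers reserve an
    estimate before calling the LLM and refund the unused portion.
    """

    def __init__(self, tokens_per_minute: int):
        self.limit = tokens_per_minute
        self.used = 0
        self.window_start = time.monotonic()
        self.lock = threading.Lock()

    def _maybe_reset(self) -> None:
        # Start a fresh window once 60 seconds have elapsed.
        now = time.monotonic()
        if now - self.window_start >= 60:
            self.used = 0
            self.window_start = now

    def reserve(self, estimate: int) -> bool:
        """Claim `estimate` tokens; False if it would exceed the limit."""
        with self.lock:
            self._maybe_reset()
            if self.used + estimate > self.limit:
                return False
            self.used += estimate
            return True

    def refund(self, unused: int) -> None:
        """Return tokens the call did not actually consume."""
        with self.lock:
            self.used = max(0, self.used - unused)

budget = TokenBudget(tokens_per_minute=10_000)
if budget.reserve(1_500):          # reserve the estimated prompt+completion size
    actual = 900                   # tokens the provider reported as consumed
    budget.refund(1_500 - actual)  # give back the 600 we overestimated
```

The same idea extends across workers by keeping the counter in shared storage (for example Redis) instead of a process-local lock, which is the distributed case the project targets.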
Stars: 17
Forks: 2
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 13, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/agents/Elijas/token-throttle"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
valmi-io/value
⚡ "Value" - https://value.valmi.io . Valmi Value is Outcome-based billing and payments...
aptible/unpage
Unpage is the open source framework for building SRE agents with infrastructure context and...
Harshit-J004/toolguard
Pytest-style reliability testing for AI agent tool chains. Catches hallucinated payloads, schema...
dipampaul17/AgentGuard
Real-time guardrail that shows token spend & kills runaway LLM/agent loops.
2001Haru/TokenWaster
The Most Useless EVER agent assistant in the Human History. Always trying to read everything in...