vienneraphael/batchling
Cut GenAI costs by 50% in two lines of code
This tool helps developers and ML engineers drastically cut costs when using Generative AI services for large-scale, non-urgent tasks. It takes your existing GenAI code, written for real-time requests, and runs it as lower-cost batch jobs instead. It is aimed at anyone building applications that process data in bulk with AI models and do not need instant responses.
Available on PyPI.
Use this if you are running large volumes of Generative AI requests for tasks like data classification, summarization, or embedding generation where immediate responses are not critical.
Not ideal if your application requires real-time, instantaneous responses from Generative AI models.
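To illustrate the mechanism the description refers to: provider batch APIs (such as OpenAI's Batch API, which offers roughly the 50% discount mentioned above) accept a JSONL file of request lines instead of individual real-time calls. The sketch below builds such a file from a list of prompts. This is not batchling's own API (see its PyPI page for that); the model name and prompts here are illustrative assumptions.

```python
import json

def build_batch_file(prompts, path="batch_input.jsonl", model="gpt-4o-mini"):
    """Write one OpenAI-Batch-API-style request line per prompt.

    Each line carries a custom_id so outputs can be matched back to
    inputs when the batch completes. Returns the number of lines written.
    """
    with open(path, "w") as f:
        for i, prompt in enumerate(prompts):
            line = {
                "custom_id": f"request-{i}",   # used to join results to inputs
                "method": "POST",
                "url": "/v1/chat/completions",
                "body": {
                    "model": model,  # assumption: any chat model name works here
                    "messages": [{"role": "user", "content": prompt}],
                },
            }
            f.write(json.dumps(line) + "\n")
    return len(prompts)

# Two illustrative non-urgent tasks: summarization and classification.
n = build_batch_file(["Summarize: ...", "Classify: ..."])
```

The resulting file would then be uploaded and submitted as a batch job that completes within the provider's window (24 hours for OpenAI) rather than immediately; tools like batchling automate this translation step.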
Stars
17
Forks
—
Language
Python
License
MIT
Last pushed
Mar 12, 2026
Commits (30d)
0
Dependencies
4
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/vienneraphael/batchling"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
openvinotoolkit/model_server
A scalable inference server for models optimized with OpenVINO™
madroidmaq/mlx-omni-server
MLX Omni Server is a local inference server powered by Apple's MLX framework, specifically...
NVIDIA-NeMo/Guardrails
NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based...
generative-computing/mellea
Mellea is a library for writing generative programs.
rhesis-ai/rhesis
Open-source platform & SDK for testing LLM and agentic apps. Define expected behavior, generate...