France-Travail/happy_vllm
A production-ready REST API for vLLM
This tool helps developers serve large language models (LLMs) as a robust, production-ready web service. You provide your chosen LLM, and it exposes an API endpoint that receives text prompts and returns generated responses. It's designed for software and machine learning engineers who need to integrate LLM capabilities into their applications reliably.
Available on PyPI.
Use this if you are a developer looking to deploy and manage a vLLM-based language model as a stable, accessible REST API for your applications.
Not ideal if you are an end user simply looking to chat with an LLM, or if you don't have experience deploying web services.
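As a rough sketch of what "integrate LLM capabilities into your application" looks like in practice: vLLM-based servers typically expose an OpenAI-style completions endpoint, so a client can POST a JSON payload and read back the generated text. The endpoint path (`/v1/completions`), port (`5000`), and model name below are assumptions for illustration, not confirmed happy_vllm defaults — check the project's documentation for the actual routes.

```python
import json
import urllib.request


def build_completion_request(prompt: str, model: str = "my-model",
                             max_tokens: int = 64) -> dict:
    """Build an OpenAI-style completion payload (field names assumed)."""
    return {"model": model, "prompt": prompt, "max_tokens": max_tokens}


def complete(prompt: str, base_url: str = "http://localhost:5000") -> str:
    """Send a completion request to a locally served model and
    return the first generated text choice."""
    payload = build_completion_request(prompt)
    req = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]
```

The same request can be made with any HTTP client; the OpenAI-compatible shape is what lets existing tooling (SDKs, proxies, gateways) talk to the server without custom glue code.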
Stars: 27
Forks: 3
Language: Python
License: AGPL-3.0
Last pushed: Oct 20, 2025
Commits (30d): 0
Dependencies: 6
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/France-Travail/happy_vllm"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
containers/ramalama
RamaLama is an open-source developer tool that simplifies the local serving of AI models from...
av/harbor
One command brings a complete pre-wired LLM stack with hundreds of services to explore.
RunanywhereAI/runanywhere-sdks
Production ready toolkit to run AI locally
runpod-workers/worker-vllm
The RunPod worker template for serving our large language model endpoints. Powered by vLLM.
foldl/chatllm.cpp
Pure C++ implementation of several models for real-time chatting on your computer (CPU & GPU)