France-Travail/happy_vllm
A production-ready REST API for vLLM
This tool helps developers serve large language models (LLMs) as a robust, production-ready web service. You provide your chosen LLM, and it exposes an API endpoint that receives text prompts and returns generated responses. It's designed for software and machine learning engineers who need to integrate LLM capabilities into their applications reliably.
Available on PyPI.
Use this if you are a developer looking to deploy and manage a vLLM-based language model as a stable, accessible REST API for your applications.
Not ideal if you are an end user simply looking to chat with an LLM, or if you don't have experience deploying web services.
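As a rough sketch of what "integrate LLM capabilities into your application" looks like in practice: vLLM-based servers typically expose an OpenAI-style completions endpoint, so a client can POST a JSON payload and read back the generated text. The endpoint path (`/v1/completions`), port (`5000`), and model name below are assumptions for illustration, not confirmed happy_vllm defaults — check the project's documentation for the actual routes.

```python
import json
import urllib.request


def build_completion_request(prompt: str, model: str = "my-model",
                             max_tokens: int = 64) -> dict:
    """Build an OpenAI-style completion payload (field names assumed)."""
    return {"model": model, "prompt": prompt, "max_tokens": max_tokens}


def complete(prompt: str, base_url: str = "http://localhost:5000") -> str:
    """Send a completion request to a locally served model and
    return the first generated text choice."""
    payload = build_completion_request(prompt)
    req = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]
```

The same request can be made with any HTTP client; the OpenAI-compatible shape is what lets existing tooling (SDKs, proxies, gateways) talk to the server without custom glue code.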
Stars: 27
Forks: 3
Language: Python
License: AGPL-3.0
Last pushed: Oct 20, 2025
Commits (30d): 0
Dependencies: 6
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/France-Travail/happy_vllm"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
containers/ramalama
RamaLama is an open-source developer tool that simplifies the local serving of AI models from...
av/harbor
One command brings a complete pre-wired LLM stack with hundreds of services to explore.
RunanywhereAI/runanywhere-sdks
Production ready toolkit to run AI locally
runpod-workers/worker-vllm
The RunPod worker template for serving our large language model endpoints. Powered by vLLM.
foldl/chatllm.cpp
Pure C++ implementation of several models for real-time chatting on your computer (CPU & GPU)