bentoml/OpenLLM
Run any open-source LLM, such as DeepSeek or Llama, as an OpenAI-compatible API endpoint in the cloud.
This project helps software developers, machine learning engineers, and data scientists host and serve open-source Large Language Models (LLMs) on their own cloud infrastructure. It serves open-source LLMs behind an OpenAI-compatible API endpoint, making it straightforward to integrate these models into applications. The primary users are developers building AI-powered applications who need to deploy and manage LLMs.
12,161 stars. Available on PyPI.
Use this if you are a developer who wants to run and expose open-source or custom LLMs via an OpenAI-compatible API, either on your local machine or in the cloud.
Not ideal if you are an end-user looking for a ready-to-use application and do not have programming or cloud infrastructure experience.
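Because the server speaks the OpenAI API shape, any OpenAI-style client can talk to it. A minimal sketch, assuming a locally running OpenLLM server on `http://localhost:3000/v1` (the default port in OpenLLM's docs) and a hypothetical model tag; only standard-library modules are used:

```python
import json
import urllib.request

# Assumptions: OpenLLM is serving locally at this base URL, and the model
# tag below is a hypothetical example -- substitute whatever you deployed.
BASE_URL = "http://localhost:3000/v1"
MODEL = "llama3.2:1b"  # hypothetical model tag


def build_chat_request(prompt: str) -> dict:
    """Build an OpenAI-style /chat/completions request payload."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }


def chat(prompt: str) -> str:
    """POST the payload to the local server and return the assistant reply."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # OpenAI-compatible servers return choices[].message.content
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    # Requires a running server; building the payload alone does not.
    print(build_chat_request("Hello!"))
```

The same payload works with the official `openai` Python client by pointing its `base_url` at the server, which is the usual integration path.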
Stars
12,161
Forks
803
Language
Python
License
Apache-2.0
Last pushed
Mar 09, 2026
Commits (30d)
0
Dependencies
15
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/bentoml/OpenLLM"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
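The curl example above can be reproduced from Python. A small sketch that builds the same endpoint URL from its path components (the `quality_url` helper and its parameter names are illustrative, not part of the API):

```python
from urllib.parse import quote

API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(ecosystem: str, owner: str, repo: str) -> str:
    """Build the quality-endpoint URL, mirroring the curl example:
    /api/v1/quality/<ecosystem>/<owner>/<repo>."""
    return f"{API_BASE}/{quote(ecosystem)}/{quote(owner)}/{quote(repo)}"


# Same URL as the curl command above:
print(quality_url("transformers", "bentoml", "OpenLLM"))
```

Fetch it with `urllib.request.urlopen` or any HTTP client; within the 100-requests/day anonymous tier no key header is needed.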
Related models
ludwig-ai/ludwig
Low-code framework for building custom LLMs, neural networks, and other AI models
withcatai/node-llama-cpp
Run AI models locally on your machine with node.js bindings for llama.cpp. Enforce a JSON schema...
mudler/LocalAI
:robot: The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and...
zhudotexe/kani
kani (カニ) is a highly hackable microframework for tool-calling language models. (NLP-OSS @ EMNLP 2023)
SciSharp/LLamaSharp
A C#/.NET library to run LLM (🦙LLaMA/LLaVA) on your local device efficiently.