bentoml/BentoML
The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!
AI/ML engineers use BentoML to turn trained machine learning models into live prediction services. It packages your model code and dependencies into a standardized format and exposes the model behind a deployable API endpoint. This lets practitioners serve a wide range of AI applications, from language models to computer vision systems, and make them accessible for real-world use.
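A minimal sketch of what a service definition looks like, assuming the Python decorator API introduced in BentoML 1.2; the IrisClassifier name and threshold logic are placeholders standing in for a real trained model:

import bentoml

@bentoml.service
class IrisClassifier:
    def __init__(self) -> None:
        # Placeholder for loading a saved model (e.g. via joblib or the
        # BentoML model store); hardcoded to keep the sketch self-contained.
        self.threshold = 2.5

    @bentoml.api
    def predict(self, petal_length: float) -> str:
        # Stand-in inference logic for a trained classifier.
        return "versicolor" if petal_length > self.threshold else "setosa"

Running bentoml serve against this file starts a local HTTP server (port 3000 by default) that exposes the method as a POST endpoint.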
8,516 stars. Used by 4 other packages. Actively maintained with 14 commits in the last 30 days. Available on PyPI.
Use this if you need to quickly deploy trained AI or machine learning models as production-ready web services that handle real-time inference requests (see the client sketch below).
Not ideal if you are looking for a platform to train your models or perform data preprocessing, as this tool focuses specifically on serving already-trained models.
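For the real-time inference case, a deployed service like the sketch above is called over plain HTTP. The route and JSON shape here assume BentoML's default convention of mapping each API method to a POST endpoint named after it:

import requests

# Hypothetical client call against the locally served sketch above.
resp = requests.post(
    "http://localhost:3000/predict",
    json={"petal_length": 1.4},
    timeout=10,
)
print(resp.json())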
Stars: 8,516
Forks: 927
Language: Python
License: Apache-2.0
Category: MLOps
Last pushed: Mar 13, 2026
Commits (30d): 14
Dependencies: 42
Reverse dependents: 4
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/mlops/bentoml/BentoML"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
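The same endpoint can be queried from Python. Only the URL comes from this page; the JSON response schema, including the field names below, is an assumption:

import requests

url = "https://pt-edge.onrender.com/api/v1/quality/mlops/bentoml/BentoML"
resp = requests.get(url, timeout=10)
resp.raise_for_status()  # surface rate-limit or server errors
data = resp.json()       # assumed: JSON body mirroring the stats shown above
print(data.get("stars"), data.get("forks"))  # field names are assumptions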
Related tools
nndeploy/nndeploy
An Easy-to-Use and High-Performance AI Deployment Framework
kubeflow/trainer
Distributed AI Model Training and LLM Fine-Tuning on Kubernetes
cncf/llm-in-action
🤖 Discover how to apply your LLM app skills on Kubernetes!
llmcloud24/de.KCD-Summer-School-2024
Learn how to deploy your own LLM in the de.NBI cloud via a step-by-step guided journey...
ray-project/llms-in-prod-workshop-2023
Deploy and Scale LLM-based applications