openvinotoolkit/model_server

A scalable inference server for models optimized with OpenVINO™

Score: 71 / 100 (Verified)

This tool helps software developers efficiently deploy and manage machine learning models, including large language models and other generative AI models, in production environments. It takes trained models from various frameworks (such as TensorFlow and ONNX) and serves them to clients over standard network protocols (REST or gRPC). The typical user is a software architect or MLOps engineer responsible for integrating AI models into applications.
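As a sketch of what client-side integration looks like: OpenVINO Model Server exposes a TensorFlow-Serving-compatible REST endpoint of the form `/v1/models/<name>:predict`. The host, port, model name, and input data below are placeholders, not values from this page.

```python
import json
from urllib import request

def build_predict_request(host: str, model: str, instances: list) -> request.Request:
    """Build a TensorFlow-Serving-style REST predict request.

    OpenVINO Model Server serves this API at /v1/models/<name>:predict;
    "localhost:8000" and "resnet" below are hypothetical placeholders.
    """
    url = f"http://{host}/v1/models/{model}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return request.Request(url, data=body,
                           headers={"Content-Type": "application/json"})

req = build_predict_request("localhost:8000", "resnet", [[0.0, 0.1, 0.2]])
# Actually sending it would be request.urlopen(req); skipped here since
# no live server is assumed.
```

The same payload shape works over gRPC via the TensorFlow Serving / KServe protos, which is usually preferred for high-throughput deployments.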

836 stars. Actively maintained with 38 commits in the last 30 days.

Use this if you need a scalable and flexible way to serve machine learning models from various frameworks to client applications, especially in cloud or microservices-based architectures.

Not ideal if you are looking for a tool to train machine learning models or if your deployment needs are very simple and do not require high performance or remote inference capabilities.

Tags: MLOps, Model Deployment, Generative AI, Microservices, Cloud Computing
No package published, no dependents tracked.
Maintenance: 20 / 25
Adoption: 10 / 25
Maturity: 16 / 25
Community: 25 / 25


Stars: 836
Forks: 241
Language: C++
License: Apache-2.0
Last pushed: Mar 13, 2026
Commits (30d): 38

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/openvinotoolkit/model_server"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
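The same endpoint can be called from code. A minimal Python sketch of the curl example above; the response is assumed to be JSON, and its field names are not documented here, so parsing is left generic:

```python
import json
from urllib import request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    # Mirrors the curl example: /quality/<category>/<owner>/<repo>.
    return f"{API_BASE}/{category}/{owner}/{repo}"

def fetch_quality(category: str, owner: str, repo: str) -> dict:
    # Assumes the endpoint returns a JSON object; no schema is documented
    # on this page, so callers should inspect the keys themselves.
    with request.urlopen(quality_url(category, owner, repo)) as resp:
        return json.load(resp)

url = quality_url("generative-ai", "openvinotoolkit", "model_server")
# fetch_quality(...) would perform the actual request (subject to the
# 100 requests/day unauthenticated limit noted above).
```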