openvinotoolkit/model_server
A scalable inference server for models optimized with OpenVINO™
This tool helps software developers deploy and manage machine learning models, including large language models and other generative AI models, in production environments. It takes trained models in standard formats (such as TensorFlow, ONNX, or OpenVINO IR) and exposes them over standard network protocols (REST or gRPC). The typical user is a software architect or MLOps engineer responsible for integrating AI models into applications.
836 stars. Actively maintained with 38 commits in the last 30 days.
Use this if you need a scalable and flexible way to serve machine learning models from various frameworks to client applications, especially in cloud or microservices-based architectures.
Not ideal if you are looking for a tool to train machine learning models or if your deployment needs are very simple and do not require high performance or remote inference capabilities.
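As a sketch of what a client interaction looks like: OpenVINO Model Server supports the KServe v2 REST protocol, where an inference request is a JSON body POSTed to `/v2/models/{name}/infer`. The server address, model name, input tensor name, and shape below are all hypothetical placeholders, not taken from any real deployment.

```python
import json

# Hypothetical server address and model name -- adjust for your deployment.
SERVER = "http://localhost:8000"
MODEL = "my_model"

# KServe v2 REST inference payload: each input declares a name, shape,
# datatype, and a flat list of values.
payload = {
    "inputs": [
        {
            "name": "input",  # must match the model's input tensor name
            "shape": [1, 4],
            "datatype": "FP32",
            "data": [0.1, 0.2, 0.3, 0.4],
        }
    ]
}

# The request would be sent as:
#   POST {SERVER}/v2/models/{MODEL}/infer
# (e.g. via requests or urllib); only the serialized body is shown here.
body = json.dumps(payload)
print(body)
```

The same models can also be reached over gRPC with the equivalent KServe `ModelInfer` call, which is generally preferable for high-throughput or low-latency clients.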
Stars
836
Forks
241
Language
C++
License
Apache-2.0
Category
Generative AI
Last pushed
Mar 13, 2026
Commits (30d)
38
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/openvinotoolkit/model_server"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Related tools
madroidmaq/mlx-omni-server
MLX Omni Server is a local inference server powered by Apple's MLX framework, specifically...
NVIDIA-NeMo/Guardrails
NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based...
generative-computing/mellea
Mellea is a library for writing generative programs.
rhesis-ai/rhesis
Open-source platform & SDK for testing LLM and agentic apps. Define expected behavior, generate...
taco-group/OpenEMMA
OpenEMMA, a permissively licensed open source "reproduction" of Waymo’s EMMA model.