gasparian/ml-serving-template
Serving large ML models independently and asynchronously, using a message queue and KV storage for communication with other services [EXPERIMENT]
This project helps machine learning engineers and MLOps professionals deploy large machine learning models into production environments. It takes requests from your existing applications and uses a message queue to send data to a separate, dedicated inference service. The result is a more robust and scalable architecture for serving predictions, especially for computationally intensive models.
No commits in the last 6 months.
Use this if your existing web application struggles to serve predictions from very large models, or if your inference pipeline is too slow to handle synchronous requests efficiently.
Not ideal if you are working with small, simple models that can be easily integrated directly into your web application without performance or scalability issues.
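The request flow the description outlines — the web app enqueues a task on a message queue, a separate inference service consumes it, and the result lands in a KV store the app polls by task id — can be sketched as follows. This is a minimal illustration using in-process stand-ins (a `queue.Queue` for the broker, a plain dict for the KV store); it is not the repository's code, and a real deployment would use an external broker and store (e.g. RabbitMQ and Redis).

```python
import queue
import threading
import uuid

# In-process stand-ins for an external message broker and KV store.
# All names here are illustrative, not taken from the project.
task_queue: "queue.Queue[dict]" = queue.Queue()
kv_store: dict[str, str] = {}

def submit(payload: str) -> str:
    """Web-app side: enqueue a task and return its id immediately,
    instead of blocking on a slow synchronous prediction call."""
    task_id = str(uuid.uuid4())
    task_queue.put({"id": task_id, "payload": payload})
    return task_id

def inference_worker() -> None:
    """Inference-service side: consume tasks from the queue and
    write results to the KV store, keyed by task id."""
    while True:
        task = task_queue.get()
        # A real worker would load and run the large model here.
        kv_store[task["id"]] = f"prediction for {task['payload']}"
        task_queue.task_done()

threading.Thread(target=inference_worker, daemon=True).start()

tid = submit("input-1")
task_queue.join()     # wait until the worker has processed the task
print(kv_store[tid])  # the web app would instead poll the KV store by id
```

Because the producer only holds a task id, the web application stays responsive while the dedicated service chews through heavy inference work at its own pace.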
Stars: 16
Forks: 1
Language: Python
License: MIT
Category:
Last pushed: Jul 20, 2021
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/mlops/gasparian/ml-serving-template"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
feast-dev/feast
The Open Source Feature Store for AI/ML
clearml/clearml-serving
ClearML - Model-Serving Orchestration and Repository Solution
lakehq/sail
LakeSail's computation framework with a mission to unify batch processing, stream processing,...
PaddlePaddle/Serving
A flexible, high-performance carrier for machine learning models (the PaddlePaddle serving-deployment framework)
SeldonIO/MLServer
An inference server for your machine learning models, including support for multiple frameworks,...