gasparian/ml-serving-template
Serving large ML models independently and asynchronously, using a message queue and KV storage for communication with other services [EXPERIMENT]
This project helps machine learning engineers and MLOps professionals deploy large machine learning models into production environments. It takes requests from your existing applications and uses a message queue to send data to a separate, dedicated inference service. The result is a more robust and scalable architecture for serving predictions, especially for computationally intensive models.
No commits in the last 6 months.
Use this if your existing web application struggles to serve predictions from very large models, or if your inference pipeline is too slow to handle synchronous requests efficiently.
Not ideal if you are working with small, simple models that can be easily integrated directly into your web application without performance or scalability issues.
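The request flow the description outlines — the web app enqueues a task on a message queue, a separate inference service consumes it, and the result lands in a KV store the app polls by task id — can be sketched as follows. This is a minimal illustration using in-process stand-ins (a `queue.Queue` for the broker, a plain dict for the KV store); it is not the repository's code, and a real deployment would use an external broker and store (e.g. RabbitMQ and Redis).

```python
import queue
import threading
import uuid

# In-process stand-ins for an external message broker and KV store.
# All names here are illustrative, not taken from the project.
task_queue: "queue.Queue[dict]" = queue.Queue()
kv_store: dict[str, str] = {}

def submit(payload: str) -> str:
    """Web-app side: enqueue a task and return its id immediately,
    instead of blocking on a slow synchronous prediction call."""
    task_id = str(uuid.uuid4())
    task_queue.put({"id": task_id, "payload": payload})
    return task_id

def inference_worker() -> None:
    """Inference-service side: consume tasks from the queue and
    write results to the KV store, keyed by task id."""
    while True:
        task = task_queue.get()
        # A real worker would load and run the large model here.
        kv_store[task["id"]] = f"prediction for {task['payload']}"
        task_queue.task_done()

threading.Thread(target=inference_worker, daemon=True).start()

tid = submit("input-1")
task_queue.join()     # wait until the worker has processed the task
print(kv_store[tid])  # the web app would instead poll the KV store by id
```

Because the producer only holds a task id, the web application stays responsive while the dedicated service chews through heavy inference work at its own pace.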
Stars: 16
Forks: 1
Language: Python
License: MIT
Category:
Last pushed: Jul 20, 2021
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/mlops/gasparian/ml-serving-template"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
feast-dev/feast
The Open Source Feature Store for AI/ML
clearml/clearml-serving
ClearML - Model-Serving Orchestration and Repository Solution
lakehq/sail
LakeSail's computation framework with a mission to unify batch processing, stream processing,...
PaddlePaddle/Serving
A flexible, high-performance carrier for machine learning models (the PaddlePaddle serving-deployment framework)
SeldonIO/MLServer
An inference server for your machine learning models, including support for multiple frameworks,...