gasparian/ml-serving-template

Serving large ML models independently and asynchronously, using a message queue and KV storage for communication with other services [EXPERIMENT]

27 / 100 (Experimental)

This project helps machine learning engineers and MLOps teams deploy large machine learning models to production. It takes requests from your existing applications, pushes them onto a message queue consumed by a separate, dedicated inference service, and hands predictions back through a key-value store. The result is a more robust and scalable architecture for serving predictions, especially from computationally intensive models.
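A minimal sketch of that pattern, assuming Redis serves as both the message queue and the key-value result store; the repo's actual broker, key names, and model interface may differ:

import json
import time
import uuid

import redis

r = redis.Redis()

QUEUE = "inference:requests"          # hypothetical queue name
RESULT_KEY = "inference:result:{id}"  # hypothetical result-key pattern

def submit(features):
    """Web-app side: enqueue a request, then poll the kv-store for the result."""
    request_id = str(uuid.uuid4())
    r.rpush(QUEUE, json.dumps({"id": request_id, "features": features}))
    key = RESULT_KEY.format(id=request_id)
    while (raw := r.get(key)) is None:  # in practice, add a timeout
        time.sleep(0.05)
    return json.loads(raw)

def serve(model):
    """Inference side: pop requests, run the model, write predictions back."""
    while True:
        _, raw = r.blpop(QUEUE)          # blocks until a request arrives
        req = json.loads(raw)
        pred = model.predict([req["features"]])[0]
        r.set(RESULT_KEY.format(id=req["id"]),
              json.dumps({"prediction": float(pred)}),  # assumes numeric output
              ex=300)                     # expire stored results after 5 minutes

Because the two sides only share the queue and the store, the inference service can be scaled, restarted, or upgraded independently of the web application.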

No commits in the last 6 months.

Use this if your existing web application struggles to serve predictions from very large AI models, or if your inference pipeline is too slow to handle synchronous requests efficiently.

Not ideal if you are working with small, simple models that can be easily integrated directly into your web application without performance or scalability issues.

MLOps · Model Deployment · Scalable AI · Machine Learning Infrastructure · Asynchronous Processing
Stale (6m) · No Package · No Dependents
Maintenance: 0 / 25
Adoption: 6 / 25
Maturity: 16 / 25
Community: 5 / 25

Stars: 16
Forks: 1
Language: Python
License: MIT
Last pushed: Jul 20, 2021
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/mlops/gasparian/ml-serving-template"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
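
The same data can be fetched from Python; a minimal sketch using the requests library, assuming the endpoint returns JSON:

import requests

resp = requests.get(
    "https://pt-edge.onrender.com/api/v1/quality/mlops/gasparian/ml-serving-template"
)
resp.raise_for_status()  # fail loudly on rate limiting or server errors
print(resp.json())       # assumed JSON payload with the quality scores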