ShannonAI/service-streamer
Boosting your Web Services of Deep Learning Applications.
This tool helps developers make their deep learning web services faster and more efficient. It collects individual user requests, groups them into mini-batches, and feeds each batch to the deep learning model in a single call. Batched inference lets the model, especially on GPUs, process many requests concurrently, significantly boosting the service's throughput and responsiveness. It's aimed at machine learning engineers and developers who deploy deep learning models as web services.
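The mini-batching idea above can be sketched with the Python standard library alone. This is an illustrative stand-in, not service-streamer's actual implementation: `MiniBatcher`, `batch_predict`, `batch_size`, and `max_latency` are assumed names chosen for the example. Request threads submit one item each; a background thread gathers items until the batch fills or a latency deadline passes, then runs one model call per batch.

```python
import queue
import threading
import time


class MiniBatcher:
    """Minimal sketch of dynamic request batching: many callers submit
    single items, a background thread groups them into mini-batches and
    invokes the model once per batch."""

    def __init__(self, batch_predict, batch_size=8, max_latency=0.05):
        self.batch_predict = batch_predict  # fn: list of inputs -> list of outputs
        self.batch_size = batch_size        # flush when this many items collected
        self.max_latency = max_latency      # or after this many seconds
        self._queue = queue.Queue()
        threading.Thread(target=self._loop, daemon=True).start()

    def predict(self, item):
        """Called by each request handler; blocks until its result is ready."""
        slot = {"input": item, "event": threading.Event(), "output": None}
        self._queue.put(slot)
        slot["event"].wait()
        return slot["output"]

    def _loop(self):
        while True:
            batch = [self._queue.get()]  # block for the first item
            deadline = time.monotonic() + self.max_latency
            while len(batch) < self.batch_size:
                timeout = deadline - time.monotonic()
                if timeout <= 0:
                    break
                try:
                    batch.append(self._queue.get(timeout=timeout))
                except queue.Empty:
                    break
            # One model call serves the whole mini-batch.
            outputs = self.batch_predict([s["input"] for s in batch])
            for slot, out in zip(batch, outputs):
                slot["output"] = out
                slot["event"].set()
```

service-streamer itself exposes a comparable blocking interface (per its README, e.g. a `ThreadedStreamer` wrapping a batch prediction function), with additional multi-worker and multi-GPU modes that this sketch omits.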
1,244 stars. No commits in the last 6 months.
Use this if you are deploying a deep learning model as a web service and want to increase its speed and throughput, especially when using GPUs.
Not ideal if your application doesn't involve deep learning models, or if you are not deploying a web service that requires high-performance inference.
Stars: 1,244
Forks: 187
Language: Python
License: Apache-2.0
Category:
Last pushed: May 13, 2021
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/ShannonAI/service-streamer"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
modelscope/modelscope
ModelScope: bring the notion of Model-as-a-Service to life.
basetenlabs/truss
The simplest way to serve AI/ML models in production
Lightning-AI/LitServe
A minimal Python framework for building custom AI inference servers with full control over...
deepjavalibrary/djl-serving
A universal scalable machine learning model deployment solution
tensorflow/serving
A flexible, high-performance serving system for machine learning models