epwalsh/batched-fn
🦀 Rust server plugin for deploying deep learning models with batched prediction
When deploying deep learning models that serve requests one at a time, this tool automatically groups individual inputs into mini-batches for more efficient GPU use: single requests are queued, run through the model together in one forward pass, and each caller gets back its own individual output. It is aimed at backend developers who need to optimize the throughput and latency of model-serving infrastructure.
No commits in the last 6 months.
Use this if you are building a web server for deep learning models and want to improve GPU utilization and reduce latency by automatically batching individual prediction requests.
Not ideal if your application primarily processes large batches of data offline or if your model is not GPU-intensive and doesn't benefit significantly from batching.
Stars
22
Forks
2
Language
Rust
License
Apache-2.0
Category
Last pushed
Mar 10, 2024
Monthly downloads
77
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/epwalsh/batched-fn"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
tracel-ai/burn
Burn is a next generation tensor library and Deep Learning Framework that doesn't compromise on...
sonos/tract
Tiny, no-nonsense, self-contained, Tensorflow and ONNX inference
pykeio/ort
Fast ML inference & training for ONNX models in Rust
elixir-nx/ortex
ONNX Runtime bindings for Elixir
robertknight/rten
ONNX neural network inference engine