triton-inference-server/pytriton
PyTriton is a Flask/FastAPI-like interface that simplifies Triton's deployment in Python environments.
This tool helps machine learning engineers and data scientists deploy Python-based machine learning models for production inference. You provide your trained model as a Python callable, and PyTriton exposes it as a high-performance serving endpoint, accessible over HTTP or gRPC and ready to handle inference requests.
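To make the "Python function in, serving endpoint out" idea concrete, here is a minimal sketch following PyTriton's documented binding pattern. The model name "doubler", the input/output names, and the trivial doubling callable are illustrative placeholders; a real deployment would call into a PyTorch, TensorFlow, or JAX model inside infer_fn.

import numpy as np
from pytriton.decorators import batch
from pytriton.model_config import ModelConfig, Tensor
from pytriton.triton import Triton

# Hypothetical inference callable: doubles its input.
# In practice this would wrap a trained framework model.
@batch
def infer_fn(input_1):
    return {"output_1": input_1 * 2.0}

with Triton() as triton:
    triton.bind(
        model_name="doubler",  # illustrative model name
        infer_func=infer_fn,
        inputs=[Tensor(name="input_1", dtype=np.float32, shape=(-1,))],
        outputs=[Tensor(name="output_1", dtype=np.float32, shape=(-1,))],
        config=ModelConfig(max_batch_size=8),
    )
    triton.serve()  # blocks and serves over HTTP/gRPC

Once serve() is running, the model should be reachable through Triton's standard inference endpoint, by default something like POST http://localhost:8000/v2/models/doubler/infer.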
835 stars. No commits in the last 6 months.
Use this if you need to quickly deploy your Python machine learning model (from frameworks like PyTorch, TensorFlow, or JAX) and serve predictions efficiently.
Not ideal if you are looking for a tool to train models or if your primary need is data preprocessing rather than model serving.
Stars: 835
Forks: 58
Language: Python
License: Apache-2.0
Category: ml-frameworks
Last pushed: Aug 13, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/triton-inference-server/pytriton"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
triton-inference-server/server
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
gpu-mode/Triton-Puzzles
Puzzles for learning Triton
hailo-ai/hailo_model_zoo
The Hailo Model Zoo includes pre-trained models and a full building and evaluation environment
open-mmlab/mmdeploy
OpenMMLab Model Deployment Framework
hyperai/tvm-cn
TVM documentation in Simplified Chinese