triton-inference-server/model_analyzer
Triton Model Analyzer is a CLI tool for understanding the compute and memory requirements of models served by Triton Inference Server.
This tool helps machine learning engineers and MLOps professionals optimize how their models run on NVIDIA's Triton Inference Server. It takes your model files and hardware specifications, searches over candidate configurations, and generates detailed reports showing the throughput, latency, and resource trade-offs of each setting, helping you choose the best setup for your production environment.
507 stars. Actively maintained with 4 commits in the last 30 days.
Use this if you need to fine-tune the deployment of your AI models on Triton Inference Server to meet specific performance or resource efficiency targets.
Not ideal if you are looking for a tool to train or develop your AI models, as this focuses solely on optimizing their deployment and inference.
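For a concrete starting point, the tool's documented entry point is the "model-analyzer profile" subcommand. Below is a minimal sketch of driving it from Python, assuming model-analyzer is installed (pip install triton-model-analyzer) and on PATH; the model repository path and model name are placeholders you would replace with your own.

    import subprocess

    # Placeholders: point these at a real Triton model repository
    # and the name of a model inside it.
    MODEL_REPOSITORY = "/path/to/model_repository"
    MODEL_NAME = "my_model"

    # "model-analyzer profile" benchmarks candidate model configurations
    # and writes out reports on their performance trade-offs.
    subprocess.run(
        [
            "model-analyzer", "profile",
            "--model-repository", MODEL_REPOSITORY,
            "--profile-models", MODEL_NAME,
        ],
        check=True,
    )

The same command can of course be run directly in a shell; the subprocess wrapper is just a convenient way to script repeated profiling runs.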
Stars: 507
Forks: 85
Language: Python
License: Apache-2.0
Category: ml-frameworks
Last pushed: Mar 10, 2026
Commits (30d): 4
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/triton-inference-server/model_analyzer"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
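If you would rather fetch this data from code than from curl, here is a minimal Python sketch using only the standard library. It hits the endpoint shown above anonymously (no key, within the 100 requests/day tier); the response is assumed to be JSON, and since the schema is not documented on this page, the sketch simply pretty-prints whatever comes back.

    import json
    import urllib.request

    # Public endpoint from this page; no API key needed for the
    # anonymous tier (100 requests/day).
    URL = (
        "https://pt-edge.onrender.com/api/v1/quality/"
        "ml-frameworks/triton-inference-server/model_analyzer"
    )

    with urllib.request.urlopen(URL, timeout=10) as resp:
        payload = json.load(resp)

    # Schema is undocumented here, so just pretty-print the payload.
    print(json.dumps(payload, indent=2))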
Related frameworks
triton-inference-server/server
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
gpu-mode/Triton-Puzzles
Puzzles for learning Triton
hailo-ai/hailo_model_zoo
The Hailo Model Zoo includes pre-trained models and a full building and evaluation environment
open-mmlab/mmdeploy
OpenMMLab Model Deployment Framework
hyperai/tvm-cn
TVM documentation in Simplified Chinese