mlcommons/inference
Reference implementations of MLPerf® inference benchmarks
This project offers standardized benchmarks to measure how quickly various systems can run machine learning models across different deployment scenarios. It takes machine learning models (such as ResNet, BERT, and Llama 2) and system configurations as input, and reports performance metrics such as inference speed. System architects, hardware engineers, and ML platform developers use it to compare and optimize the performance of their AI systems.
1,539 stars. Actively maintained with 25 commits in the last 30 days.
Use this if you need to objectively evaluate and compare the inference speed of different hardware and software configurations for your machine learning deployments.
Not ideal if you are looking for tools to train machine learning models or to optimize model accuracy rather than deployment performance.
Stars: 1,539
Forks: 612
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 12, 2026
Commits (30d): 25
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/mlcommons/inference"
Open to everyone: 100 requests/day with no key. A free API key raises the limit to 1,000 requests/day.
Related frameworks
NVIDIA/TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit...
mlcommons/training
Reference implementations of MLPerf® training benchmarks
datamade/usaddress
:us: a python library for parsing unstructured United States address strings into address components
GRAAL-Research/deepparse
Deepparse is a state-of-the-art library for parsing multinational street addresses using deep learning
CMU-SAFARI/Pythia
A customizable hardware prefetching framework using online reinforcement learning as described...