inference and training
These are ecosystem siblings: the reference implementations for the inference and training benchmarks, respectively, within the broader MLPerf® benchmarking suite.
About inference
mlcommons/inference
Reference implementations of MLPerf® inference benchmarks
This project offers standardized benchmarks that measure how quickly systems can run machine learning models across different deployment scenarios. Given a machine learning model (such as ResNet, BERT, or Llama2) and a system configuration, it reports performance metrics such as latency and throughput. System architects, hardware engineers, and ML platform developers use it to compare and optimize the performance of their AI systems.
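As a rough illustration of what such a benchmark measures, the sketch below times a dummy workload in two styles: an offline-like pass that reports throughput, and a one-query-at-a-time pass that reports a latency percentile. The `run_inference` function is a hypothetical stand-in, not the actual MLPerf harness, which runs real models through its LoadGen component.

```python
import time

def run_inference(query):
    """Hypothetical stand-in for running a real model on one query."""
    return sum(x * x for x in query)  # placeholder compute

def measure_offline_throughput(queries):
    """Offline-style measurement: process all queries back to back and
    report queries per second."""
    start = time.perf_counter()
    for q in queries:
        run_inference(q)
    elapsed = time.perf_counter() - start
    return len(queries) / elapsed

def measure_single_stream_latency(queries):
    """Single-stream-style measurement: issue queries one at a time,
    record each latency, and report a high percentile (90th here)."""
    latencies = []
    for q in queries:
        t0 = time.perf_counter()
        run_inference(q)
        latencies.append(time.perf_counter() - t0)
    latencies.sort()
    return latencies[int(0.9 * (len(latencies) - 1))]

queries = [list(range(100))] * 50
qps = measure_offline_throughput(queries)
p90 = measure_single_stream_latency(queries)
```

The real benchmark adds query scheduling, accuracy checks, and strict run rules on top of this basic timing idea.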
About training
mlcommons/training
Reference implementations of MLPerf® training benchmarks
This project provides standardized training benchmarks for machine learning models across domains such as language processing, image generation, and recommendation systems. Given a reference dataset and a chosen model implementation, it measures the time needed to train that model to a predefined quality target. Deep learning engineers and researchers use it to objectively compare the training performance of different ML hardware and software stacks.
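The train-to-target-quality idea can be sketched as a loop that stops once a quality metric crosses a threshold. This toy `train_one_epoch` is a hypothetical stand-in whose accuracy improves with diminishing returns; the real benchmarks train full models and report wall-clock time, not epoch counts.

```python
def train_one_epoch(accuracy):
    """Hypothetical stand-in for one epoch of training: accuracy
    improves by 30% of the remaining gap to perfect accuracy."""
    return accuracy + (1.0 - accuracy) * 0.3

def time_to_target(target_accuracy, max_epochs=100):
    """Train until the quality target is reached and return the number
    of epochs it took, mirroring the time-to-train measurement."""
    accuracy = 0.0
    for epoch in range(1, max_epochs + 1):
        accuracy = train_one_epoch(accuracy)
        if accuracy >= target_accuracy:
            return epoch
    raise RuntimeError("target quality not reached within max_epochs")

epochs = time_to_target(0.9)
```

Because the stopping condition is a quality target rather than a fixed epoch budget, faster hardware or better software directly shortens the measured result.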