TristanBilot/mlx-benchmark
Benchmark of Apple MLX operations on all Apple Silicon chips (GPU, CPU) + MPS and CUDA.
This tool helps machine learning engineers and researchers understand the performance of MLX operations across Apple Silicon chips (M1–M4) and compare them against PyTorch on Apple's MPS backend, the CPU, and NVIDIA CUDA GPUs. Given your hardware and MLX/PyTorch versions, it outputs detailed or averaged runtime benchmarks for a range of machine learning operations. It is well suited to anyone optimizing machine learning models for Apple hardware.
Use this if you are developing machine learning applications and need to compare the speed and efficiency of different ML frameworks and hardware configurations for specific operations.
Not ideal if you are looking for a high-level application performance monitor or a tool to benchmark entire machine learning model training workflows rather than individual operations.
Stars
217
Forks
30
Language
Python
License
MIT
Category
ML frameworks
Last pushed
Mar 08, 2026
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/TristanBilot/mlx-benchmark"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
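The same data can be fetched programmatically. Below is a minimal Python sketch built around the endpoint shown in the curl example; the shape of the JSON response is an assumption (the API's actual schema is not documented here), so the example simply prints whatever payload comes back.

```python
import json
import urllib.request

# Base path taken from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category: str, owner: str, repo: str) -> str:
    """Build the API URL for a repository's quality data."""
    return f"{API_BASE}/{category}/{owner}/{repo}"


def fetch_repo_data(category: str, owner: str, repo: str) -> dict:
    """Fetch and decode the JSON payload (makes a network call).

    The response schema is assumed to be a JSON object; adjust
    parsing to match whatever the API actually returns.
    """
    with urllib.request.urlopen(build_url(category, owner, repo)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    data = fetch_repo_data("ml-frameworks", "TristanBilot", "mlx-benchmark")
    print(json.dumps(data, indent=2))
```

Keeping URL construction in its own function makes it easy to reuse for other repositories in the same category without repeating the base path.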
Related frameworks
NVIDIA/TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit...
mlcommons/inference
Reference implementations of MLPerf® inference benchmarks
mlcommons/training
Reference implementations of MLPerf® training benchmarks
datamade/usaddress
:us: a python library for parsing unstructured United States address strings into address components
GRAAL-Research/deepparse
Deepparse is a state-of-the-art library for parsing multinational street addresses using deep learning