ProbioticFarmer/mlx-deterministic
Batch-invariant operations for deterministic LLM inference on Apple Silicon using MLX
When running large language model (LLM) inference on Apple Silicon, the same prompt can yield slightly different responses depending on how many prompts are processed at once (the batch size). This tool provides batch-invariant operations that keep LLM outputs identical and reproducible regardless of batch size, making LLM-powered applications more reliable. It is intended for AI/ML practitioners and researchers who need consistent, verifiable LLM outputs.
Use this if you need to guarantee that your LLM generates bitwise-identical outputs for the same input, regardless of the batch size used, which is critical for testing, validation, and auditing.
Not ideal if your primary concern is raw inference speed and you can tolerate minor variations in LLM outputs between different batch sizes.
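One common reason batch size can change results is that floating-point addition is not associative, so a reduction that groups its terms differently can round differently. The sketch below illustrates that effect in plain Python. It is not the project's implementation and does not use MLX; `reduce_in_chunks`, `naive_row_sums`, and `invariant_row_sums` are hypothetical names, and the rule that picks a chunk size from the batch size is an assumed stand-in for a throughput-tuned kernel.

```python
def reduce_in_chunks(values, chunk_size):
    """Sum `values` by accumulating fixed-size chunks, then the partial sums."""
    partials = []
    for start in range(0, len(values), chunk_size):
        partial = 0.0
        for v in values[start:start + chunk_size]:
            partial += v  # strict left-to-right accumulation within a chunk
        partials.append(partial)
    total = 0.0
    for p in partials:
        total += p  # partial sums are also combined left to right
    return total


def naive_row_sums(batch):
    # Hypothetical heuristic (an assumption for illustration): the chunk
    # size depends on how many rows are in the batch.
    chunk_size = 2 if len(batch) == 1 else 4
    return [reduce_in_chunks(row, chunk_size) for row in batch]


def invariant_row_sums(batch, chunk_size=2):
    # Batch-invariant variant: the chunk size never depends on the batch,
    # so each row is always reduced in the same order.
    return [reduce_in_chunks(row, chunk_size) for row in batch]


row = [1e16, 1.0, 1.0, 1.0]
alone = [row]
batched = [row, [0.5, 0.25, 0.125, 0.0625]]

print(naive_row_sums(alone)[0] == naive_row_sums(batched)[0])          # False
print(invariant_row_sums(alone)[0] == invariant_row_sums(batched)[0])  # True
```

Fixing the reduction order makes each row's result independent of what else is in the batch, at the cost of giving up batch-dependent tuning. That matches the speed trade-off noted above.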
Stars: 7
Forks: 2
Language: Python
License: MIT
Category:
Last pushed: Dec 12, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/ProbioticFarmer/mlx-deterministic"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
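The same request can be made from Python with the standard library. This is a minimal sketch: `build_quality_url` and `fetch_quality` are hypothetical helper names, the category/owner/repo URL pattern is inferred from the single example URL above, and the response is returned as raw text because the page does not state its format. Decoding as UTF-8 is also an assumption.

```python
import urllib.request

# Base path taken from the example curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    # Assumed pattern: <base>/<category>/<owner>/<repo>
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10.0):
    # Performs an unauthenticated GET and returns the body as text.
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_quality("ml-frameworks", "ProbioticFarmer", "mlx-deterministic")` issues the same request as the curl command and counts against the daily request limit.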
Higher-rated alternatives
NVIDIA/TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit...
mlcommons/inference
Reference implementations of MLPerf® inference benchmarks
mlcommons/training
Reference implementations of MLPerf® training benchmarks
datamade/usaddress
A Python library for parsing unstructured United States address strings into address components
GRAAL-Research/deepparse
Deepparse is a state-of-the-art library for parsing multinational street addresses using deep learning