ProbioticFarmer/mlx-deterministic

Batch-invariant operations for deterministic LLM inference on Apple Silicon using MLX

/ 100

Emerging

When you run large language model (LLM) inference on Apple Silicon, the same prompt can yield slightly different responses depending on how many prompts are processed at once (the batch size). This tool provides batch-invariant operations for MLX so that the output for a given input is identical and reproducible regardless of batch size, which makes LLM-powered applications more reliable. It is aimed at AI/ML practitioners and researchers who need consistent, verifiable LLM outputs.
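As general background, and not a description of this repository's internals, one well-known reason results can shift with batch size is that floating-point addition is not associative. If a kernel groups its partial sums differently for different batch sizes, the low-order bits of the result can change. A minimal sketch in plain Python (the helper `bits` is a hypothetical name used only for illustration):

```python
import struct


def bits(x: float) -> bytes:
    """Return the raw IEEE 754 double-precision bytes of x."""
    return struct.pack("<d", x)


a, b, c = 0.1, 0.2, 0.3

# Same three numbers, two different groupings of the additions.
left = (a + b) + c
right = a + (b + c)

print(left)                       # 0.6000000000000001
print(right)                      # 0.6
print(bits(left) == bits(right))  # False
```

The difference is tiny, but in an autoregressive model a small change in one logit can change which token is chosen, and every later token then depends on that choice.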

Use this if you need to guarantee that your LLM generates bitwise-identical outputs for the same input, regardless of the batch size used, which is critical for testing, validation, and auditing.
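For testing and validation, a bitwise check can be sketched as follows. This is an illustrative harness under stated assumptions, not this project's API: `is_batch_invariant`, `fixed_order_sum`, and `batch_dependent_sum` are hypothetical names, and the two toy "kernels" stand in for a real model call. The harness runs each input alone, then in larger batches, and compares the raw bytes of the results.

```python
import struct
from typing import Callable, Iterable, List, Sequence

Row = Sequence[float]
Kernel = Callable[[List[Row]], List[float]]


def to_bits(values: Iterable[float]) -> bytes:
    """Concatenate the raw IEEE 754 bytes of each value."""
    return b"".join(struct.pack("<d", v) for v in values)


def is_batch_invariant(fn: Kernel, rows: List[Row], batch_sizes=(1, 2, 4)) -> bool:
    """True if fn gives bitwise-identical per-row results at every batch size."""
    # Reference: every row processed on its own.
    reference = to_bits(fn([row])[0] for row in rows)
    for size in batch_sizes:
        out: List[float] = []
        for start in range(0, len(rows), size):
            out.extend(fn(rows[start:start + size]))
        if to_bits(out) != reference:
            return False
    return True


def fixed_order_sum(batch: List[Row]) -> List[float]:
    """Toy kernel: always reduces each row left to right."""
    results = []
    for row in batch:
        acc = 0.0
        for v in row:
            acc += v
        results.append(acc)
    return results


def batch_dependent_sum(batch: List[Row]) -> List[float]:
    """Toy kernel: changes its reduction order when the batch has more than one row."""
    results = []
    for row in batch:
        ordered = row if len(batch) == 1 else list(reversed(row))
        acc = 0.0
        for v in ordered:
            acc += v
        results.append(acc)
    return results


rows = [[0.1, 0.2, 0.3] for _ in range(4)]
print(is_batch_invariant(fixed_order_sum, rows))      # True
print(is_batch_invariant(batch_dependent_sum, rows))  # False
```

Comparing bytes rather than using a tolerance is deliberate: an approximate comparison would pass both toy kernels and hide exactly the variation that an audit needs to detect.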

Not ideal if your primary concern is raw inference speed and you can tolerate minor variations in LLM outputs between different batch sizes.

LLM-inference AI-reproducibility ML-validation Model-testing Responsible-AI

No Package No Dependents

Maintenance 6 / 25

Adoption 4 / 25

Maturity 15 / 25

Community 13 / 25

Language: Python

License: MIT

Higher-rated alternatives

NVIDIA/TransformerEngine

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit...

mlcommons/inference

Reference implementations of MLPerf® inference benchmarks

mlcommons/training

Reference implementations of MLPerf® training benchmarks

datamade/usaddress

A Python library for parsing unstructured United States address strings into address components

GRAAL-Research/deepparse

Deepparse is a state-of-the-art library for parsing multinational street addresses using deep learning
