bitsandbytes-foundation/bitsandbytes
Accessible large language models via k-bit quantization for PyTorch.
Working with large language models (LLMs) can be challenging due to their high memory demands, which often exceed the capabilities of standard hardware. bitsandbytes reduces this footprint through k-bit quantization: model weights are stored in 8-bit or 4-bit form instead of 16- or 32-bit floats, and 8-bit optimizers shrink training memory as well. The model's architecture and behavior are preserved, so researchers and AI practitioners can run and fine-tune advanced LLMs on more accessible computing resources.
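To make the core idea concrete, here is a minimal plain-Python sketch of absmax int8 quantization, the basic scheme behind 8-bit weight storage. This is an illustration of the technique only, not the library's actual CUDA-accelerated code path, and the helper names are invented for this example.

```python
# Illustrative absmax int8 quantization: scale a block of float weights so
# the largest magnitude maps to 127, store rounded int8 values plus the
# scale, and dequantize by multiplying back. Not bitsandbytes' real kernels.

def quantize_absmax(weights):
    """Map float weights to int8 range [-127, 127] via the block's abs max."""
    absmax = max(abs(w) for w in weights)
    scale = absmax / 127.0 if absmax else 1.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize_absmax(q, scale):
    """Recover approximate float weights from int8 values and the scale."""
    return [v * scale for v in q]

weights = [0.5, -1.27, 0.02, 0.9]
q, scale = quantize_absmax(weights)
approx = dequantize_absmax(q, scale)
```

Each int8 value costs 1 byte instead of 2 (fp16) or 4 (fp32), at the price of a small rounding error; real schemes quantize in small blocks so one outlier weight does not blow up the scale for the whole tensor.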
8,033 stars. Used by 74 other packages. Actively maintained with 20 commits in the last 30 days. Available on PyPI.
Use this if you need to run or fine-tune large language models but are limited by your computer's memory capacity, especially on GPUs.
Not ideal if you are working with smaller models that fit comfortably in memory, or if your research requires full 32-bit precision throughout.
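The memory constraint above is easy to quantify with a back-of-envelope estimate. The 7B parameter count below is an assumed example figure, and the calculation covers weights only (activations, KV cache, and quantization constants add overhead on top):

```python
# Weight-only memory estimate for a hypothetical 7B-parameter model at
# different bit widths. Real usage is somewhat higher than these numbers.

PARAMS = 7_000_000_000  # assumed model size for illustration

def weight_gb(bits_per_param):
    """Gigabytes needed to store the weights at a given bit width."""
    return PARAMS * bits_per_param / 8 / 1e9

fp16_gb = weight_gb(16)  # half precision
int8_gb = weight_gb(8)   # 8-bit quantized
int4_gb = weight_gb(4)   # 4-bit quantized
```

Under these assumptions, fp16 weights alone need about 14 GB, which already exceeds many consumer GPUs, while 4-bit storage brings that down to roughly 3.5 GB.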
Stars: 8,033
Forks: 831
Language: Python
License: MIT
Category:
Last pushed: Mar 10, 2026
Commits (30d): 20
Dependencies: 3
Reverse dependents: 74
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/bitsandbytes-foundation/bitsandbytes"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Related models
intel/neural-compressor
SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model...
dropbox/hqq
Official implementation of Half-Quadratic Quantization (HQQ)
OpenGVLab/OmniQuant
[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
Hsu1023/DuQuant
[NeurIPS 2024 Oral🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger...
VITA-Group/Q-GaLore
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients.