IST-DASLab/Quartet-II

Quartet II Official Code

Emerging

This project helps machine learning engineers and researchers reduce the cost of pre-training large language models. It provides tools and kernels for training these models in NVFP4, a 4-bit floating-point format, while maintaining accuracy. It takes an existing large language model architecture and training data, and outputs a model trained with less compute and memory.
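To give a feel for what training in a 4-bit format involves, here is a minimal pure-Python sketch of block-scaled 4-bit "fake quantization". It is not taken from the Quartet II code. It assumes the commonly described NVFP4 layout (E2M1 values, one shared scale per block of 16 elements) and simplifies it by keeping the scale as an ordinary Python float instead of an FP8 value. The names `E2M1_GRID`, `quantize_block`, and `fake_quantize` are hypothetical.

```python
# Magnitudes representable by a 4-bit E2M1 float (1 sign, 2 exponent, 1 mantissa bit).
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
BLOCK_SIZE = 16  # assumption: one shared scale per 16 consecutive values


def quantize_block(block):
    """Scale a block, round each value to the nearest E2M1 magnitude, then rescale."""
    amax = max(abs(v) for v in block)
    if amax == 0.0:
        return [0.0] * len(block)
    # Simplification: the scale stays a Python float rather than being stored in FP8.
    scale = amax / E2M1_GRID[-1]  # map the largest magnitude in the block to 6.0
    out = []
    for v in block:
        mag = min(E2M1_GRID, key=lambda g: abs(abs(v) / scale - g))
        out.append(mag * scale if v >= 0 else -mag * scale)
    return out


def fake_quantize(values):
    """Apply quantize_block to consecutive blocks of BLOCK_SIZE values."""
    out = []
    for i in range(0, len(values), BLOCK_SIZE):
        out.extend(quantize_block(values[i:i + BLOCK_SIZE]))
    return out
```

Calling `fake_quantize` on a list of weights returns values of the same length, each snapped to one of at most 15 distinct levels per block. That coarseness is the reason low-precision training needs care to preserve accuracy.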

Use this if you are a machine learning engineer or researcher focused on pre-training large language models and want to reduce computational costs and memory footprint without sacrificing model accuracy.

Not ideal if you are looking for a high-level API for using pre-trained models or for training smaller, non-LLM machine learning models.

large-language-model-training deep-learning-optimization AI-model-development computational-efficiency

No License No Package No Dependents

Maintenance 10 / 25

Adoption 8 / 25

Maturity 3 / 25

Community 9 / 25

Language: Python

Higher-rated alternatives

bitsandbytes-foundation/bitsandbytes

Accessible large language models via k-bit quantization for PyTorch.

intel/neural-compressor

SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model...

dropbox/hqq

Official implementation of Half-Quadratic Quantization (HQQ)

OpenGVLab/OmniQuant

[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.

Hsu1023/DuQuant

[NeurIPS 2024 Oral🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger...
