snu-mllab/GuidedQuant

Official PyTorch implementation of "GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance" (ICML 2025)

Quality score: 30 / 100 (Emerging)

This project helps machine learning engineers and researchers make large language models (LLMs) more efficient. By applying advanced quantization techniques, it takes a full-sized LLM and produces a significantly smaller, faster model that still performs well. The key benefit is the ability to run powerful LLMs on less capable hardware, or to serve them with faster inference.

No commits in the last 6 months.

Use this if you need to deploy large language models more efficiently, reducing memory footprint and speeding up inference without a significant drop in performance.

Not ideal if you are working with smaller models that don't face significant computational or memory constraints.

large-language-models model-optimization deep-learning-deployment model-quantization AI-inference
Badges: Stale (6m) · No Package · No Dependents
Score breakdown (the four subscores sum to the overall 30 / 100):
Maintenance: 2 / 25
Adoption: 8 / 25
Maturity: 15 / 25
Community: 5 / 25


Stars: 50
Forks: 2
Language: Python
License: MIT
Last pushed: Jul 06, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/snu-mllab/GuidedQuant"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.