LLM Knowledge Distillation Transformer Models
There are 38 LLM knowledge distillation models tracked. Three score above 50 (the established tier). The highest-rated is scaleapi/llm-engine at 63/100 with 821 stars. Two of the top 10 are actively maintained.
Get the tracked projects as JSON (the `limit` parameter caps results per request; this example returns up to 20 of the 38):

```shell
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=llm-knowledge-distillation&limit=20"
```
The endpoint is open to everyone: 100 requests/day with no key, or 1,000/day with a free key.
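The same query can be made from Python. A minimal sketch, assuming only the endpoint and query parameters shown in the curl example above (the JSON response schema is not documented here, so fetching and parsing is left as a comment):

```python
# Build the quality-dataset query URL for the endpoint shown above.
from urllib.parse import urlencode

BASE = "https://pt-edge.onrender.com/api/v1/datasets/quality"

def quality_url(domain: str, subcategory: str, limit: int = 20) -> str:
    """Return the GET URL for one domain/subcategory slice of the dataset."""
    params = {"domain": domain, "subcategory": subcategory, "limit": limit}
    return f"{BASE}?{urlencode(params)}"

url = quality_url("transformers", "llm-knowledge-distillation", limit=38)
print(url)
# To actually fetch (requires network access):
#   import json, urllib.request
#   data = json.load(urllib.request.urlopen(url))
```

Raising `limit` to 38 should cover the whole list in one request, subject to the daily rate limit noted above.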
| # | Model | Description | Score | Tier |
|---|---|---|---|---|
| 1 | scaleapi/llm-engine | Scale LLM Engine public repository | 63 | Established |
| 2 | AGI-Arena/MARS | The official implementation of MARS: Unleashing the Power of Variance... | | Established |
| 3 | modelscope/easydistill | A toolkit on knowledge distillation for large language models | | Established |
| 4 | AGI-Edgerunners/LLM-Adapters | Code for our EMNLP 2023 paper: "LLM-Adapters: An Adapter Family for... | | Emerging |
| 5 | Wang-ML-Lab/bayesian-peft | Bayesian Low-Rank Adaptation of LLMs: BLoB [NeurIPS 2024] and TFB [NeurIPS 2025] | | Emerging |
| 6 | sangmichaelxie/doremi | PyTorch implementation of DoReMi, a method for optimizing the data mixture... | | Emerging |
| 7 | ZO-Bench/ZO-LLM | [ICML'24] Official code for the paper "Revisiting Zeroth-Order Optimization... | | Emerging |
| 8 | Liuhong99/Sophia | The official implementation of "Sophia: A Scalable Stochastic Second-order... | | Emerging |
| 9 | ShiZhengyan/DePT | [ICLR 2024] This is the repository for the paper titled "DePT: Decomposed... | | Emerging |
| 10 | YJiangcm/Lion | [EMNLP 2023] Lion: Adversarial Distillation of Proprietary Large Language Models | | Emerging |
| 11 | golololologol/LLM-Distillery | A pipeline for LLM knowledge distillation | | Emerging |
| 12 | shufangxun/LLaVA-MoD | [ICLR 2025] LLaVA-MoD: Making LLaVA Tiny via MoE-Knowledge Distillation | | Emerging |
| 13 | horus-ai-labs/DistillFlow | Library for model distillation | | Emerging |
| 14 | yifanzhang-pro/HLA | Official project page for HLA: Higher-order Linear Attention... | | Emerging |
| 15 | OatmealLiu/FineR | [ICLR'24] Democratizing Fine-grained Visual Recognition with Large Language Models | | Emerging |
| 16 | Tebmer/Awesome-Knowledge-Distillation-of-LLMs | This repository collects papers for "A Survey on Knowledge Distillation of... | | Emerging |
| 17 | yang-ai-lab/OSF-Open-Sleep-FM | OSF: On Pre-training and Scaling of Sleep Foundation Models | | Emerging |
| 18 | ROIM1998/APT | [ICML'24 Oral] APT: Adaptive Pruning and Tuning Pretrained Language Models... | | Emerging |
| 19 | pittisl/GreenTrainer | Code for the paper "Towards Green AI in Fine-tuning Large Language Models via... | | Emerging |
| 20 | pdaicode/awesome-LLMs-finetuning | Collection of resources for finetuning Large Language Models (LLMs) | | Emerging |
| 21 | iboing/CorDA | CorDA: Context-Oriented Decomposition Adaptation of Large Language Models... | | Emerging |
| 22 | teilomillet/retrain | A Python library that uses Reinforcement Learning (RL) to train LLMs | | Experimental |
| 23 | Qwen-Applications/STAR | STAR: Similarity-guided Teacher-Assisted Refinement for Super-Tiny Function... | | Experimental |
| 24 | TamSiuhin/OPPU | Official implementation of "Democratizing Large Language Models via... | | Experimental |
| 25 | wshi83/MedAdapter | [EMNLP'24] MedAdapter: Efficient Test-Time Adaptation of Large Language... | | Experimental |
| 26 | hemantjuyal/LLM-Distillation-Lab | An experiment demonstrating instruction-following distillation, enabling the... | | Experimental |
| 27 | XelfXendr/peft_unlearning | Repository exploring the use of parameter-efficient finetuning methods for... | | Experimental |
| 28 | amazon-science/mezo_svrg | Code for the ICML 2024 paper: "Variance-reduced Zeroth-Order Methods for... | | Experimental |
| 29 | waelantar/ATTS_Complete_Free_Package | ATTS: Adaptive Test-Time Scaling, a validated framework for optimizing LLM... | | Experimental |
| 30 | amazon-science/mada_optimizer_search | Code for the ICML 2024 paper: "MADA: Meta-Adaptive Optimizers through... | | Experimental |
| 31 | BaohaoLiao/mefts | [NeurIPS 2023] Make Your Pre-trained Model Reversible: From Parameter to... | | Experimental |
| 32 | kriskrisliu/PAT | [AAAI 2025] PAT: Pruning-Aware Tuning for Large Language Models | | Experimental |
| 33 | liuyz0/DepthScaling | Inverse Depth Scaling From Most Layers Being Similar | | Experimental |
| 34 | EM7m4/Distill-R1 | Combine reinforcement learning with online teacher-student distillation to... | | Experimental |
| 35 | ikun-llm/ikun-Distill | Knowledge distillation from a teacher model 🎓 | | Experimental |
| 36 | hendrik-spl/sustainable-llm-knowledge-distillation | Resource-efficient LLM distillation: Improving sustainability and reducing... | | Experimental |
| 37 | OptimAI-Lab/RoSTE | [ICML 2025] Official code for the paper "RoSTE: An Efficient... | | Experimental |
| 38 | gumran/post-training | Easy instruction tuning and preference tuning of LLMs using the TRL library | | Experimental |