LLM Knowledge Distillation Transformer Models

There are 38 LLM knowledge distillation models tracked. Three score 50 or higher, placing them in the Established tier. The highest-rated is scaleapi/llm-engine at 63/100 with 821 stars. Two of the top 10 are actively maintained.
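
The tier label tracks the score: everything at 50 or above is Established, scores of roughly 30-49 are Emerging, and anything below that is Experimental. A minimal sketch of that mapping, with cutoffs inferred from the scores in the table below rather than taken from any official definition:

```python
def tier(score: int) -> str:
    """Map a 0-100 quality score to its tier label.

    Cutoffs are inferred from the listing (50 -> Established,
    31 -> Emerging, 29 -> Experimental); they are not documented.
    """
    if score >= 50:
        return "Established"
    if score >= 30:
        return "Emerging"
    return "Experimental"
```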

Get all 38 projects as JSON:

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=llm-knowledge-distillation&limit=38"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
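
To work with the results programmatically, here is a minimal Python sketch. It uses only the endpoint shown above; the response layout and the field names ("name", "score", "tier") are assumptions about the schema, not documented behavior, so adjust them to match the JSON the API actually returns.

```python
import json
import urllib.request

# Endpoint from the curl example above, fetching all 38 tracked projects.
URL = ("https://pt-edge.onrender.com/api/v1/datasets/quality"
       "?domain=transformers&subcategory=llm-knowledge-distillation&limit=38")

with urllib.request.urlopen(URL) as resp:
    payload = json.load(resp)

# The response shape is an assumption: either a bare JSON array or an
# object wrapping one under a "data" key. Adjust for the real schema.
projects = payload if isinstance(payload, list) else payload.get("data", [])

# Print one score-sorted summary line per project.
for p in sorted(projects, key=lambda x: x.get("score", 0), reverse=True):
    print(f"{p.get('score', '?'):>3}  {p.get('tier', '?'):<12}  {p.get('name', '?')}")
```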

| # | Model | Description | Score | Tier |
|---|-------|-------------|-------|------|
| 1 | scaleapi/llm-engine | Scale LLM Engine public repository | 63 | Established |
| 2 | AGI-Arena/MARS | The official implementation of MARS: Unleashing the Power of Variance... | 54 | Established |
| 3 | modelscope/easydistill | A toolkit on knowledge distillation for large language models | 50 | Established |
| 4 | AGI-Edgerunners/LLM-Adapters | Code for our EMNLP 2023 paper: "LLM-Adapters: An Adapter Family for... | 45 | Emerging |
| 5 | Wang-ML-Lab/bayesian-peft | Bayesian Low-Rank Adaptation of LLMs: BLoB [NeurIPS 2024] and TFB [NeurIPS 2025] | 45 | Emerging |
| 6 | sangmichaelxie/doremi | PyTorch implementation of DoReMi, a method for optimizing the data mixture... | 42 | Emerging |
| 7 | ZO-Bench/ZO-LLM | [ICML'24] Official code for the paper "Revisiting Zeroth-Order Optimization... | 42 | Emerging |
| 8 | Liuhong99/Sophia | The official implementation of "Sophia: A Scalable Stochastic Second-order... | 41 | Emerging |
| 9 | ShiZhengyan/DePT | [ICLR 2024] This is the repository for the paper titled "DePT: Decomposed... | 40 | Emerging |
| 10 | YJiangcm/Lion | [EMNLP 2023] Lion: Adversarial Distillation of Proprietary Large Language Models | 39 | Emerging |
| 11 | golololologol/LLM-Distillery | A pipeline for LLM knowledge distillation | 39 | Emerging |
| 12 | shufangxun/LLaVA-MoD | [ICLR 2025] LLaVA-MoD: Making LLaVA Tiny via MoE-Knowledge Distillation | 38 | Emerging |
| 13 | horus-ai-labs/DistillFlow | Library for model distillation | 37 | Emerging |
| 14 | yifanzhang-pro/HLA | Official project page for HLA: Higher-order Linear Attention... | 36 | Emerging |
| 15 | OatmealLiu/FineR | [ICLR'24] Democratizing Fine-grained Visual Recognition with Large Language Models | 35 | Emerging |
| 16 | Tebmer/Awesome-Knowledge-Distillation-of-LLMs | This repository collects papers for "A Survey on Knowledge Distillation of... | 34 | Emerging |
| 17 | yang-ai-lab/OSF-Open-Sleep-FM | OSF: On Pre-training and Scaling of Sleep Foundation Models | 33 | Emerging |
| 18 | ROIM1998/APT | [ICML'24 Oral] APT: Adaptive Pruning and Tuning Pretrained Language Models... | 33 | Emerging |
| 19 | pittisl/GreenTrainer | Code for the paper "Towards Green AI in Fine-tuning Large Language Models via... | 32 | Emerging |
| 20 | pdaicode/awesome-LLMs-finetuning | Collection of resources for finetuning Large Language Models (LLMs). | 31 | Emerging |
| 21 | iboing/CorDA | CorDA: Context-Oriented Decomposition Adaptation of Large Language Models... | 31 | Emerging |
| 22 | teilomillet/retrain | A Python library that uses Reinforcement Learning (RL) to train LLMs. | 29 | Experimental |
| 23 | Qwen-Applications/STAR | STAR: Similarity-guided Teacher-Assisted Refinement for Super-Tiny Function... | 28 | Experimental |
| 24 | TamSiuhin/OPPU | Official implementation of "Democratizing Large Language Models via... | 28 | Experimental |
| 25 | wshi83/MedAdapter | [EMNLP'24] MedAdapter: Efficient Test-Time Adaptation of Large Language... | 23 | Experimental |
| 26 | hemantjuyal/LLM-Distillation-Lab | An experiment demonstrating instruction-following distillation, enabling the... | 21 | Experimental |
| 27 | XelfXendr/peft_unlearning | Repository exploring the use of parameter-efficient finetuning methods for... | 21 | Experimental |
| 28 | amazon-science/mezo_svrg | Code for the ICML 2024 paper: "Variance-reduced Zeroth-Order Methods for... | 21 | Experimental |
| 29 | waelantar/ATTS_Complete_Free_Package | ATTS: Adaptive Test-Time Scaling - A validated framework for optimizing LLM... | 20 | Experimental |
| 30 | amazon-science/mada_optimizer_search | Code for the ICML 2024 paper: "MADA: Meta-Adaptive Optimizers through... | 20 | Experimental |
| 31 | BaohaoLiao/mefts | [NeurIPS 2023] Make Your Pre-trained Model Reversible: From Parameter to... | 19 | Experimental |
| 32 | kriskrisliu/PAT | [AAAI 2025] PAT: Pruning-Aware Tuning for Large Language Models | 18 | Experimental |
| 33 | liuyz0/DepthScaling | Inverse Depth Scaling From Most Layers Being Similar | 17 | Experimental |
| 34 | EM7m4/Distill-R1 | Combine reinforcement learning with online teacher-student distillation to... | 14 | Experimental |
| 35 | ikun-llm/ikun-Distill | Knowledge distillation from a teacher model 🎓 | 14 | Experimental |
| 36 | hendrik-spl/sustainable-llm-knowledge-distillation | Resource-efficient LLM distillation: Improving sustainability and reducing... | 13 | Experimental |
| 37 | OptimAI-Lab/RoSTE | [ICML 2025] Official code for the paper "RoSTE: An Efficient... | 13 | Experimental |
| 38 | gumran/post-training | Easy instruction tuning and preference tuning of LLMs using the TRL library. | 10 | Experimental |