Knowledge Distillation and Compression Tools for NLP

Tools and methods for distilling large NLP models into smaller, faster models through knowledge transfer, model compression, and pruning techniques. Does NOT include general model optimization, quantization-only approaches, or unrelated NLP applications.
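Most of the projects below build on the same core recipe: train a small student model to match the temperature-softened output distribution of a large teacher, blended with the ordinary supervised loss. A minimal PyTorch sketch of that soft-target objective follows; the function name and hyperparameter defaults are illustrative, not taken from any listed tool.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Hinton-style soft-target distillation loss (illustrative sketch)."""
    # Soft-target term: KL divergence between temperature-softened student
    # and teacher distributions; the T**2 factor keeps gradient magnitudes
    # comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T ** 2)
    # Hard-target term: standard cross-entropy against the gold labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```

Toolkits like TextBrewer generalize this idea with configurable intermediate-layer losses and multi-teacher setups; a temperature between 2 and 4 and an even split between the two terms are common starting points.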

This list tracks 50 knowledge distillation and compression tools. The highest-rated is airaria/TextBrewer, scoring 48/100 with 1,697 GitHub stars.

Get all 50 projects as JSON:

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=nlp&subcategory=knowledge-distillation-compression&limit=50"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
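For scripted access, here is a minimal Python sketch using the requests library. The query parameters mirror the curl example above; the layout of the JSON response (a top-level list or a "projects" key) is an assumption to verify against the live payload.

```python
import requests

API_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"

# Anonymous access allows 100 requests/day; a free API key raises it to 1,000/day.
resp = requests.get(
    API_URL,
    params={
        "domain": "nlp",
        "subcategory": "knowledge-distillation-compression",
        "limit": 50,
    },
    timeout=10,
)
resp.raise_for_status()

# ASSUMPTION: the body exposes the projects as a top-level list or under a
# "projects" key; print the raw payload once to confirm the real schema.
payload = resp.json()
projects = payload if isinstance(payload, list) else payload.get("projects", [])
for project in projects:
    print(project)
```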

| # | Tool | Description | Score | Tier |
|---|------|-------------|-------|------|
| 1 | airaria/TextBrewer | A PyTorch-based knowledge distillation toolkit for natural language processing | 48 | Emerging |
| 2 | sunyilgdx/NSP-BERT | The code for our paper "NSP-BERT: A Prompt-based Zero-Shot Learner Through... | 45 | Emerging |
| 3 | princeton-nlp/CoFiPruning | [ACL 2022] Structured Pruning Learns Compact and Accurate Models... | 44 | Emerging |
| 4 | kssteven418/LTP | [KDD'22] Learned Token Pruning for Transformers | 44 | Emerging |
| 5 | georgian-io/Transformers-Domain-Adaptation | [DEPRECATED] Adapt Transformer-based language models to new text domains | 41 | Emerging |
| 6 | qiangsiwei/bert_distill | BERT distillation (distillation experiments based on BERT) | 41 | Emerging |
| 7 | microsoft/LiST | Lite Self-Training | 39 | Emerging |
| 8 | LiyuanLucasLiu/LD-Net | Language Model Pruning for Sequence Labeling | 38 | Emerging |
| 9 | mit-han-lab/neurips-micronet | [JMLR'20] NeurIPS 2019 MicroNet Challenge Efficient Language Modeling, Champion | 38 | Emerging |
| 10 | KarineAyrs/knowledge-distillation-semantic-search | KDSS is a framework for knowledge distillation from LLMs | 38 | Emerging |
| 11 | Alibaba-NLP/MultilangStructureKD | [ACL 2020] Structure-Level Knowledge Distillation For Multilingual Sequence Labeling | 38 | Emerging |
| 12 | cambridgeltl/mirror-bert | [EMNLP'21] Mirror-BERT: Converting Pretrained Language Models to universal... | 37 | Emerging |
| 13 | cambridgeltl/MirrorWiC | [CoNLL'21] MirrorWiC: On Eliciting Word-in-Context Representations from... | 36 | Emerging |
| 14 | lancopku/DynamicKD | Code for EMNLP 2021 main conference paper "Dynamic Knowledge Distillation... | 36 | Emerging |
| 15 | PrithivirajDamodaran/Alt-ZSC | Alternate Implementation for Zero Shot Text Classification: Instead of... | 35 | Emerging |
| 16 | INK-USC/sparse-distillation | Code for "Sparse Distillation: Speeding Up Text Classification by Using... | 35 | Emerging |
| 17 | elephantmipt/bert-distillation | Distillation of BERT model with the Catalyst framework | 35 | Emerging |
| 18 | alinlab/MASKER | MASKER: Masked Keyword Regularization for Reliable Text Classification (AAAI 2021) | 33 | Emerging |
| 19 | JingqingZ/KG4ZeroShotText | Source code of the paper "Integrating Semantic Knowledge to Tackle Zero-shot... | 33 | Emerging |
| 20 | wmkouw/ssa-nlp | Sequential subspace alignment for temporal domain adaptation in natural... | 32 | Emerging |
| 21 | gyunggyung/DistilKoBiLSTM | Distilling Task-Specific Knowledge from Teacher Model into BiLSTM | 32 | Emerging |
| 22 | kiankd/corel2019 | Code for AAAI 2019 Network Interpretability workshop paper | 31 | Emerging |
| 23 | albertan017/HICL | The official implementation of the paper HICL: Hashtag-Driven In-Context... | 31 | Emerging |
| 24 | xv44586/Knowledge-Distillation-NLP | Some demos of knowledge distillation in NLP | 30 | Emerging |
| 25 | TheLucasSchwarz/zeroshotENGINE | zeroshot-engine: Zero-Shot Text Classification with LLMs in Python | 30 | Emerging |
| 26 | foxar124/distillery | Like Homebrew but with less fizz; install binaries as fast and as easy as... | 29 | Experimental |
| 27 | sunprinceS/Hierarchical-Attention-Model | HierAttModel for Question Answering | 27 | Experimental |
| 28 | ritaranx/NeST | [AAAI 2023] This is the code for our paper "Neighborhood-Regularized... | 27 | Experimental |
| 29 | cheneydon/efficient-bert | This repository contains the code for the paper in Findings of EMNLP 2021:... | 26 | Experimental |
| 30 | roeeaharoni/unsupervised-domain-clusters | Code and data accompanying our ACL 2020 paper, "Unsupervised Domain Clusters... | 24 | Experimental |
| 31 | cheneydon/hrkd | This repository contains the code for the paper in EMNLP 2021: "HRKD:... | 24 | Experimental |
| 32 | alexandra-chron/hierarchical-domain-adaptation | Code of NAACL 2022 "Efficient Hierarchical Domain Adaptation for Pretrained... | 24 | Experimental |
| 33 | amazon-science/wqa-cerberus | [EMNLP 2022 (Long, Findings)] CERBERUS: Multi-head Student Model to distill... | 20 | Experimental |
| 34 | AdrianBZG/SFAVEL | [ICLR 2024] Unsupervised Pretraining for Fact Verification by Language Model... | 20 | Experimental |
| 35 | yzhan238/PIEClass | The source code used for paper "PIEClass: Weakly-Supervised Text... | 19 | Experimental |
| 36 | yashmanne/intra-distillation | Repository aiming to reproduce EMNLP 2022 paper "The Importance of Being... | 18 | Experimental |
| 37 | Shawn-Guo-CN/Multiple-Generation-Based-Knowledge-Distillation | Multiple Generation Based Knowledge Distillation: A Roadmap | 18 | Experimental |
| 38 | domiwk/didots | This is the repository for the paper "DiDOTS: Knowledge Distillation from... | 17 | Experimental |
| 39 | Md-Emon-Hasan/DistilBERT-model-with-HF-Transformer | 📝 DistilBERT, a lightweight Transformer model from Hugging Face, for various... | 17 | Experimental |
| 40 | LiteSSLHub/DisCo | This is the public repository of EMNLP 2023 paper "DisCo: Co-training... | 16 | Experimental |
| 41 | CogComp/Benchmarking-Zero-shot-Text-Classification | Code for EMNLP 2019 paper "Benchmarking zero-shot text classification:... | 14 | Experimental |
| 42 | cloneofsimo/zeroshot-storytelling | GitHub repository for Zero Shot Visual Storytelling | 14 | Experimental |
| 43 | xuanzebi/Paper-Knowledge_Distillation-Adversarial_Training-NLP | Papers on knowledge distillation and adversarial training in NLP | 13 | Experimental |
| 44 | JunhoKim94/TutorKD | Tutoring Helps Students Learn Better: Improving Knowledge Distillation for... | 12 | Experimental |
| 45 | tgargiani/Adaptive-Boundary | Metric Learning and Adaptive Boundary for Out-of-Domain Detection (NLDB 2022) | 12 | Experimental |
| 46 | leszkolukasz/training-1.58bit-llms-via-distillation | Repository for mini-paper "Training 1.58bit LLMs via Distillation" | 11 | Experimental |
| 47 | Ajax0564/Transformer-NLP | This is all you need for NLP Transformer training and knowledge distillation | 11 | Experimental |
| 48 | YunHaaaa/UROP | NLP, knowledge distillation, pruning | 11 | Experimental |
| 49 | MidiyaZhu/MICL | Code for Logit Separability-Driven Samples and Multiple Class-Related Words... | 11 | Experimental |
| 50 | Smu-Tan/ZS-NMT-Variations | [EMNLP 2023] Towards a Better Understanding of Variations in Zero-Shot Neural... | 10 | Experimental |