lancopku/DynamicKD
Code for EMNLP 2021 main conference paper "Dynamic Knowledge Distillation for Pre-trained Language Models"
When developing specialized AI models for language tasks, you often start by training a large, powerful "teacher" model and then distilling its knowledge into a smaller, faster "student" model. This project provides methods to dynamically adjust how that knowledge transfer happens. It takes your pre-trained teacher and student language models and helps the student learn more effectively, producing a smaller model that retains much of the teacher's performance.
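As a rough illustration of the distillation objective this line of work builds on (a minimal sketch, not the repository's actual code), the student is trained to match the teacher's temperature-softened output distribution:

```python
import math

def softmax(logits, temperature=1.0):
    # Scale logits by temperature; a higher T yields a softer distribution.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student) over temperature-softened distributions,
    # scaled by T^2 so gradients stay comparable across temperatures.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl
```

"Dynamic" distillation methods like this project's vary quantities such as the temperature or the weighting of this loss during training, rather than fixing them up front.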
No commits in the last 6 months.
Use this if you are a machine learning engineer or researcher looking to create smaller, more efficient natural language processing models without sacrificing too much performance compared to large pre-trained models.
Not ideal if you are looking for a ready-to-use, off-the-shelf NLP model without needing to engage in model training or fine-tuning.
Stars
41
Forks
6
Language
Python
License
MIT
Category
NLP
Last pushed
Aug 09, 2022
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/lancopku/DynamicKD"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
airaria/TextBrewer
A PyTorch-based knowledge distillation toolkit for natural language processing
sunyilgdx/NSP-BERT
The code for our paper "NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original...
princeton-nlp/CoFiPruning
[ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408
kssteven418/LTP
[KDD'22] Learned Token Pruning for Transformers
georgian-io/Transformers-Domain-Adaptation
:no_entry: [DEPRECATED] Adapt Transformer-based language models to new text domains