LiteSSLHub/DisCo
This is the public repository for the EMNLP 2023 paper "DisCo: Co-training Distilled Student Models for Semi-supervised Text Mining".
DisCo helps data scientists and NLP researchers perform text mining tasks like classification or summarization using significantly smaller and faster models. It takes your existing labeled and unlabeled text data, along with a larger pre-trained language model, and outputs multiple fine-tuned, lightweight models. These models achieve comparable performance to their larger counterparts but require less computational power and inference time.
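DisCo's exact training objective is defined in the paper; as a rough illustration of the underlying idea, the soft-label distillation loss commonly used when fitting a small student model against a larger teacher can be sketched as follows (function names and the temperature value are illustrative, not DisCo's actual implementation):

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions,
    the standard soft-label objective in knowledge distillation."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# A student that matches the teacher incurs near-zero loss;
# a student that disagrees incurs a larger, positive loss.
teacher = [3.0, 1.0, 0.2]
aligned = distillation_loss(teacher, [3.0, 1.0, 0.2])
disagree = distillation_loss(teacher, [0.2, 1.0, 3.0])
```

A higher temperature softens both distributions, exposing more of the teacher's relative preferences among non-top classes, which is what makes soft labels more informative than hard ones.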
No commits in the last 6 months.
Use this if you need to deploy text mining models on resource-constrained environments or want to speed up inference without sacrificing significant performance, especially when labeled data is limited.
Not ideal if you prioritize maximum possible accuracy above all else and have abundant computational resources and labeled data, as larger, undistilled models may offer marginal performance gains.
Stars: 62
Forks: —
Language: Python
License: —
Category: —
Last pushed: Dec 30, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/LiteSSLHub/DisCo"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
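For callers who prefer Python over curl, the endpoint URL above can be built programmatically. This sketch assumes the path pattern `/quality/<category>/<owner>/<repo>` generalizes from the single example shown; the function name is illustrative:

```python
def quality_api_url(owner, repo, category="nlp",
                    base="https://pt-edge.onrender.com/api/v1/quality"):
    """Build the quality-data API URL for a repository.

    Assumes the /quality/<category>/<owner>/<repo> path pattern
    seen in the curl example above.
    """
    return f"{base}/{category}/{owner}/{repo}"

url = quality_api_url("LiteSSLHub", "DisCo")
# -> "https://pt-edge.onrender.com/api/v1/quality/nlp/LiteSSLHub/DisCo"
```

The resulting URL can then be fetched with any HTTP client, subject to the rate limits noted above.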
Higher-rated alternatives
airaria/TextBrewer
A PyTorch-based knowledge distillation toolkit for natural language processing
sunyilgdx/NSP-BERT
The code for our paper "NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original...
kssteven418/LTP
[KDD'22] Learned Token Pruning for Transformers
princeton-nlp/CoFiPruning
[ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408
georgian-io/Transformers-Domain-Adaptation
:no_entry: [DEPRECATED] Adapt Transformer-based language models to new text domains