LiteSSLHub/DisCo
This is the public repository for the EMNLP 2023 paper "DisCo: Co-training Distilled Student Models for Semi-supervised Text Mining".
DisCo helps data scientists and NLP researchers perform text mining tasks like classification or summarization using significantly smaller and faster models. It takes your existing labeled and unlabeled text data, along with a larger pre-trained language model, and outputs multiple fine-tuned, lightweight models. These models achieve comparable performance to their larger counterparts but require less computational power and inference time.
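DisCo's exact training objective is defined in the paper; as a rough illustration of the underlying idea, the soft-label distillation loss commonly used when fitting a small student model against a larger teacher can be sketched as follows (function names and the temperature value are illustrative, not DisCo's actual implementation):

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions,
    the standard soft-label objective in knowledge distillation."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# A student that matches the teacher incurs near-zero loss;
# a student that disagrees incurs a larger, positive loss.
teacher = [3.0, 1.0, 0.2]
aligned = distillation_loss(teacher, [3.0, 1.0, 0.2])
disagree = distillation_loss(teacher, [0.2, 1.0, 3.0])
```

A higher temperature softens both distributions, exposing more of the teacher's relative preferences among non-top classes, which is what makes soft labels more informative than hard ones.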
No commits in the last 6 months.
Use this if you need to deploy text mining models on resource-constrained environments or want to speed up inference without sacrificing significant performance, especially when labeled data is limited.
Not ideal if you prioritize maximum possible accuracy above all else and have abundant computational resources and labeled data, as larger, undistilled models may offer marginal performance gains.
Stars: 62
Forks: —
Language: Python
License: —
Category: —
Last pushed: Dec 30, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/LiteSSLHub/DisCo"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
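For callers who prefer Python over curl, the endpoint URL above can be built programmatically. This sketch assumes the path pattern `/quality/<category>/<owner>/<repo>` generalizes from the single example shown; the function name is illustrative:

```python
def quality_api_url(owner, repo, category="nlp",
                    base="https://pt-edge.onrender.com/api/v1/quality"):
    """Build the quality-data API URL for a repository.

    Assumes the /quality/<category>/<owner>/<repo> path pattern
    seen in the curl example above.
    """
    return f"{base}/{category}/{owner}/{repo}"

url = quality_api_url("LiteSSLHub", "DisCo")
# -> "https://pt-edge.onrender.com/api/v1/quality/nlp/LiteSSLHub/DisCo"
```

The resulting URL can then be fetched with any HTTP client, subject to the rate limits noted above.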
Higher-rated alternatives
airaria/TextBrewer
A PyTorch-based knowledge distillation toolkit for natural language processing
sunyilgdx/NSP-BERT
The code for our paper "NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original...
kssteven418/LTP
[KDD'22] Learned Token Pruning for Transformers
princeton-nlp/CoFiPruning
[ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408
georgian-io/Transformers-Domain-Adaptation
:no_entry: [DEPRECATED] Adapt Transformer-based language models to new text domains