ritaranx/NeST
[AAAI 2023] This is the code for our paper "Neighborhood-Regularized Self-Training for Learning with Few Labels".
This project helps data scientists, machine learning engineers, and researchers classify text documents when labeled data is scarce. You provide a small set of labeled text documents and a larger pool of unlabeled documents, and it trains a model that categorizes new, unseen text. It is intended for text classification tasks in any domain.
No commits in the last 6 months.
Use this if you need to classify documents into categories but have very few examples for each category and a lot of unclassified text.
Not ideal if you have a large, well-labeled dataset already, as this method is specifically designed for low-resource scenarios.
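The workflow described above (a few labeled documents plus a larger unlabeled pool) can be illustrated with a generic self-training loop. The sketch below is not the NeST algorithm and does not use this repository's code: it swaps in a toy bag-of-words nearest-centroid classifier for the real model, omits the neighborhood regularization named in the paper title, and every function name, the `threshold` parameter, and the sample texts are hypothetical choices made only for illustration.

```python
from collections import Counter
import math


def vectorize(text):
    """Bag-of-words term counts for a lowercased, whitespace-split text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroids(labeled):
    """Sum the term counts of all documents in each class (the toy 'model')."""
    out = {}
    for text, label in labeled:
        out.setdefault(label, Counter()).update(vectorize(text))
    return out


def predict(cents, text):
    """Return (label, confidence); confidence is the margin between the top two classes."""
    vec = vectorize(text)
    scores = sorted(((cosine(vec, c), label) for label, c in cents.items()), reverse=True)
    best_score, best_label = scores[0]
    runner_up = scores[1][0] if len(scores) > 1 else 0.0
    return best_label, best_score - runner_up


def self_train(labeled, unlabeled, threshold=0.1, max_rounds=5):
    """Generic self-training: repeatedly pseudo-label confident documents and refit."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        cents = centroids(labeled)
        confident, remaining = [], []
        for text in pool:
            label, conf = predict(cents, text)
            (confident if conf >= threshold else remaining).append((text, label))
        if not confident:
            break  # nothing new to learn from; stop early
        labeled.extend(confident)  # pseudo-labels join the training set
        pool = [text for text, _ in remaining]
    return centroids(labeled)


# Hypothetical toy data: one labeled example per class, four unlabeled documents.
labeled = [
    ("the team won the football match", "sports"),
    ("the stock market fell sharply today", "finance"),
]
unlabeled = [
    "the football team scored a late goal",
    "investors sold stock as the market dropped",
    "the goal keeper saved the match",
    "bank shares and bond prices fell",
]

model = self_train(labeled, unlabeled)
print(predict(model, "a late goal decided the match")[0])  # sports
```

The confidence threshold is the main design choice in a loop like this: a higher value admits fewer but cleaner pseudo-labels, while a lower value grows the training set faster at the risk of reinforcing early mistakes.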
Stars: 12
Forks: 1
Language: Python
License: MIT
Category:
Last pushed: Jan 11, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/ritaranx/NeST"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
airaria/TextBrewer
A PyTorch-based knowledge distillation toolkit for natural language processing
sunyilgdx/NSP-BERT
The code for our paper "NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original...
kssteven418/LTP
[KDD'22] Learned Token Pruning for Transformers
princeton-nlp/CoFiPruning
[ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408
georgian-io/Transformers-Domain-Adaptation
[DEPRECATED] Adapt Transformer-based language models to new text domains