yueyu1030/ReGen

[ACL'23 Findings] This is the code repo for our ACL'23 Findings paper "ReGen: Zero-Shot Text Classification via Training Data Generation with Progressive Dense Retrieval".

/ 100

Emerging

This tool classifies text documents such as news articles, product reviews, or Wikipedia entries into categories for which you have no labeled training examples. You provide a collection of unlabeled text and a set of predefined categories, and it outputs classified documents. It suits data analysts, content managers, or researchers who need to sort large volumes of text without extensive manual labeling.

No commits in the last 6 months.

Use this if you need to classify large amounts of text into categories but lack enough pre-labeled examples to train a traditional classifier from scratch.

Not ideal if you need very high precision on highly nuanced or safety-critical text, since zero-shot methods are generally less accurate than classifiers trained on labeled data.

text classification, content categorization, document analysis, data labeling, sentiment analysis

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 6 / 25

Maturity 16 / 25

Community 10 / 25


Language: Python

License: MIT

Higher-rated alternatives

n-waves/multifit

The code to reproduce results from paper "MultiFiT: Efficient Multi-lingual Language Model...

princeton-nlp/SimCSE

[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821

yxuansu/SimCTG

[NeurIPS'22 Spotlight] A Contrastive Framework for Neural Text Generation

alibaba-edu/simple-effective-text-matching

Source code of the ACL2019 paper "Simple and Effective Text Matching with Richer Alignment Features".

Shark-NLP/OpenICL

OpenICL is an open-source framework to facilitate research, development, and prototyping of...
