yueyu1030/ReGen

[ACL'23 Findings] This is the code repo for our ACL'23 Findings paper "ReGen: Zero-Shot Text Classification via Training Data Generation with Progressive Dense Retrieval".

Overall score: 32 / 100 (Emerging)

This tool helps categorize text documents like news articles, product reviews, or Wikipedia entries, even for categories you haven't explicitly trained on. You provide a collection of unlabeled text and a set of predefined categories, and it outputs classified documents. It's ideal for data analysts, content managers, or researchers who need to sort large volumes of text without extensive manual labeling.

No commits in the last 6 months.

Use this if you need to classify large amounts of text into categories but lack enough pre-labeled examples to train a traditional classifier from scratch.

Not ideal if you need very high precision on nuanced or safety-critical text, since zero-shot methods typically make more errors than classifiers trained on labeled in-domain data.

Tags: text classification, content categorization, document analysis, data labeling, sentiment analysis
Status: Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 6 / 25
Maturity 16 / 25
Community 10 / 25


Stars: 24
Forks: 3
Language: Python
License: MIT
Last pushed: Sep 08, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/yueyu1030/ReGen"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
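
The same request can be made from Python. This is a minimal sketch using only the standard library. The helper names `quality_url` and `fetch_quality` are hypothetical, the reading of the path as category/owner/repository is inferred from the single URL shown above, and the page does not say what format the response takes, so JSON decoding is an assumption.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the request URL for one repository.

    Assumption: the three path segments are category, owner, and repository
    name, as the example URL (nlp/yueyu1030/ReGen) suggests.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Request the data and decode the body.

    Assumption: the body is JSON. If it is not, json.loads raises ValueError.
    """
    with urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("nlp", "yueyu1030", "ReGen")` issues the same GET request as the curl command. Unauthenticated calls count against the 100 requests/day limit, so cache the result if you poll repeatedly.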