microsoft/ASTRA

Self-training with Weak Supervision (NAACL 2021)

Score: 41 / 100 (Emerging)

This framework helps data scientists and machine learning engineers create robust classification models faster when manually labeling large datasets is too expensive. By combining domain-specific rules, a small amount of labeled data, and a large pool of unlabeled data, it automatically generates high-quality weak labels. The output is a trained deep neural network capable of accurately classifying new instances.
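The combination of rules, a few labeled examples, and an unlabeled pool can be sketched roughly as follows. This is an illustrative toy, not ASTRA's actual implementation: the rule functions, the `WordCountStudent` class, and the `self_train` loop are hypothetical names. A plain majority vote stands in for whatever learned rule aggregation the framework uses, and a word-count scorer stands in for the deep neural network.

```python
from collections import Counter

ABSTAIN = None  # a rule returns ABSTAIN when it does not apply

# Hypothetical domain rules: each maps a text to a label or ABSTAIN.
def rule_refund(text):
    return "complaint" if "refund" in text.lower() else ABSTAIN

def rule_thanks(text):
    return "praise" if "thank" in text.lower() else ABSTAIN

RULES = [rule_refund, rule_thanks]

def weak_label(text, rules=RULES):
    """Majority vote over the rules that fire; ABSTAIN if none fire or on a tie."""
    votes = Counter(rule(text) for rule in rules)
    votes.pop(ABSTAIN, None)
    if not votes:
        return ABSTAIN
    ranked = votes.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]

class WordCountStudent:
    """Toy student model: scores each class by summed per-class word counts."""

    def __init__(self):
        self.counts = {}

    def fit(self, examples):
        self.counts = {}
        for text, label in examples:
            self.counts.setdefault(label, Counter()).update(text.lower().split())
        return self

    def predict(self, text):
        words = text.lower().split()
        scores = {lab: sum(c[w] for w in words) for lab, c in self.counts.items()}
        if not scores:
            return ABSTAIN
        best = max(scores, key=scores.get)
        return best if scores[best] > 0 else ABSTAIN

def self_train(labeled, unlabeled, rounds=2):
    """Alternate between pseudo-labeling the unlabeled pool and refitting the student."""
    student = WordCountStudent().fit(labeled)
    for _ in range(rounds):
        pseudo = []
        for text in unlabeled:
            label = weak_label(text)            # prefer the rules when they fire
            if label is ABSTAIN:
                label = student.predict(text)   # otherwise fall back to the student
            if label is not ABSTAIN:
                pseudo.append((text, label))
        student = WordCountStudent().fit(labeled + pseudo)
    return student
```

In this sketch, examples that no rule covers receive a label only once the student has seen related vocabulary in an earlier round. That is the basic reason self-training can extend rule coverage beyond what the rules match directly.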

163 stars. No commits in the last 6 months.

Use this if you need to train a classification model but lack sufficient manually labeled data and can define some heuristic rules for your domain.

Not ideal if you already have large-scale, high-quality labeled datasets or if your task doesn't easily lend itself to rule-based weak supervision.

natural-language-processing data-labeling text-classification machine-learning-engineering low-resource-data
Stale (6m), No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 15 / 25

Stars: 163
Forks: 21
Language: Python
License: MIT
Last pushed: Jul 24, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/microsoft/ASTRA"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
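The same request can be made from Python using only the standard library. This is a minimal sketch: the helper names `quality_url` and `fetch_quality` are hypothetical, and it assumes the endpoint returns a JSON body, which the listing does not state. The response fields are not documented here, so the sketch returns the decoded payload unchanged rather than guessing at its structure.

```python
import json
import urllib.request

# Base path taken from the curl example; owner and repo are appended to it.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks"

def quality_url(owner, repo):
    """Build the endpoint URL for a given repository."""
    return f"{BASE_URL}/{owner}/{repo}"

def fetch_quality(owner, repo, timeout=10.0):
    """Fetch the quality record, assuming the server responds with JSON."""
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Example usage (performs a network request):
#   record = fetch_quality("microsoft", "ASTRA")
#   print(json.dumps(record, indent=2))
```

With the keyless limit of 100 requests/day, it is probably worth caching the result locally rather than calling `fetch_quality` repeatedly.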