decile-team/spear

SPEAR: Programmatically label and build training data quickly.

44 / 100 (Emerging)

Building machine learning models often requires a lot of labeled data, which is time-consuming and expensive to create. SPEAR helps you generate training data quickly by defining simple rules or heuristics, even if your existing data is mostly unlabeled. It takes raw, unlabeled data and a set of labeling rules, and outputs a programmatically labeled dataset ready for training. It is aimed at data scientists and ML engineers who want to speed up data preparation for their models.
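To make the idea of rule-based labeling concrete, here is a minimal Python sketch. It does not use SPEAR's API: the rule functions, the `ABSTAIN` marker, and the spam/ham task are all hypothetical, and simple majority voting stands in for whatever label aggregation the library actually performs.

```python
from collections import Counter
from typing import Callable, List, Optional, Sequence, Tuple

# A rule returns ABSTAIN when it has no opinion about an example.
ABSTAIN = None

Rule = Callable[[str], Optional[str]]


# Hypothetical labeling rules for a toy spam/ham task (illustration only).
def lf_contains_free(text: str) -> Optional[str]:
    return "spam" if "free" in text.lower() else ABSTAIN


def lf_has_url(text: str) -> Optional[str]:
    return "spam" if "http://" in text or "https://" in text else ABSTAIN


def lf_greeting(text: str) -> Optional[str]:
    return "ham" if text.lower().startswith(("hi", "hello")) else ABSTAIN


RULES: List[Rule] = [lf_contains_free, lf_has_url, lf_greeting]


def label_one(text: str, rules: Sequence[Rule] = RULES) -> Optional[str]:
    """Combine rule outputs by majority vote; abstain on no votes or a tie."""
    votes = [v for v in (rule(text) for rule in rules) if v is not ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # conflicting rules with equal support
    return ranked[0][0]


def label_dataset(
    texts: Sequence[str], rules: Sequence[Rule] = RULES
) -> List[Tuple[str, Optional[str]]]:
    """Attach a weak label (or ABSTAIN) to every raw text."""
    return [(t, label_one(t, rules)) for t in texts]
```

Examples where every rule abstains, or where rules disagree evenly, come back unlabeled. That is the usual trade-off of this approach: coverage and accuracy depend on how many rules you write and how well they agree.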

109 stars. No commits in the last 6 months.

Use this if you need to rapidly create labeled datasets for machine learning models from large amounts of raw or weakly labeled data, without extensive manual annotation.

Not ideal if you require highly precise, human-expert-level labels for every single data point and have the resources for extensive manual annotation.

data-labeling machine-learning-engineering natural-language-processing data-preparation weak-supervision
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 9 / 25
Maturity 16 / 25
Community 19 / 25

Stars: 109
Forks: 22
Language: Jupyter Notebook
License: MIT
Last pushed: Jun 27, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/decile-team/spear"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
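The same request can be made from Python with only the standard library. This is a sketch under assumptions: the helper names `quality_url` and `fetch_quality` are hypothetical, reading the path as category/owner/repo is inferred from the single URL shown above, and the response is assumed to be JSON since its schema is not documented here.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL; the category/owner/repo layout is an assumption."""
    parts = (quote(p, safe="") for p in (category, owner, repo))
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response, assuming the API returns JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("ml-frameworks", "decile-team", "spear")` issues the same GET request as the curl command. With the keyless limit of 100 requests/day, it is worth caching the result rather than calling it in a loop.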