dpasse/extr-ds

Library to programmatically build labeled datasets for Named-Entity Recognition (NER) and Relation Extraction (RE) Machine Learning tasks

Score: 37 / 100 (Emerging)

This tool helps data scientists and ML engineers create high-quality, labeled text datasets for training custom AI models. You provide raw text and a set of rules, and it automatically generates structured labels identifying specific entities (like names or places) and the relationships between them. This helps you efficiently prepare data for tasks like automatically extracting information from documents.
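The rule-driven labeling idea described above can be sketched in plain Python. This is only an illustration of the general technique using the standard `re` module; it does not use or reproduce the extr-ds API, and every name in it (`RULES`, `Entity`, `label_entities`, `label_relations`, the sample sentence) is hypothetical.

```python
import re
from typing import NamedTuple


class Entity(NamedTuple):
    label: str
    start: int
    end: int
    text: str


# Hypothetical labeling rules: entity label -> regular expression.
RULES = {
    "PERSON": re.compile(r"\b(?:Alice|Bob)\b"),
    "CITY": re.compile(r"\b(?:Paris|Berlin)\b"),
}


def label_entities(text, rules=RULES):
    """Apply every rule to the text and return entities sorted by position."""
    entities = [
        Entity(label, m.start(), m.end(), m.group())
        for label, pattern in rules.items()
        for m in pattern.finditer(text)
    ]
    return sorted(entities, key=lambda e: e.start)


def label_relations(entities, text, cue=" lives in ", relation="LIVES_IN"):
    """Link a PERSON to the CITY that follows it when the cue phrase sits between them."""
    relations = []
    for left, right in zip(entities, entities[1:]):
        between = text[left.end:right.start]
        if left.label == "PERSON" and right.label == "CITY" and between == cue:
            relations.append((left.text, relation, right.text))
    return relations


text = "Alice lives in Paris. Bob visited Berlin."
entities = label_entities(text)
relations = label_relations(entities, text)
```

Here `entities` holds four labeled spans with character offsets, and `relations` holds the single person-to-city pair joined by the cue phrase. A real dataset builder would typically add tokenization and an export format on top of this.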

No commits in the last 6 months. Available on PyPI.

Use this if you need to programmatically build large, labeled text datasets for training AI models to recognize entities or relationships within text.

Not ideal if you prefer manual annotation for small datasets or if you're not comfortable defining labeling rules programmatically.

natural-language-processing data-labeling information-extraction machine-learning-engineering
Maintenance 0 / 25
Adoption 4 / 25
Maturity 25 / 25
Community 8 / 25

Stars: 8
Forks: 1
Language: Python
License: MIT
Last pushed: Jun 10, 2023
Commits (30d): 0
Dependencies: 1

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/dpasse/extr-ds"

Open to everyone: 100 requests/day without a key, or 1,000 requests/day with a free key.
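The same request can be made from Python with the standard library. This is a minimal sketch under two assumptions not stated on this page: that the path follows a category/owner/repo layout (inferred from the single URL above) and that the endpoint returns JSON. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL (assumed layout: category/owner/repo)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the quality record; assumes the response body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("nlp", "dpasse", "extr-ds")
```

With the keyless limit of 100 requests/day, caching the result locally is sensible when querying many repositories.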