dpasse/extr-ds
A library for programmatically building labeled datasets for named-entity recognition (NER) and relation extraction (RE) machine learning tasks.
This tool helps data scientists and ML engineers create labeled text datasets for training custom models. You provide raw text and a set of rules, and it generates structured labels identifying specific entities (such as names or places) and the relationships between them. The resulting data can be used to train models for tasks such as extracting information from documents.
No commits in the last 6 months. Available on PyPI.
Use this if you need to programmatically build large, labeled text datasets for training AI models to recognize entities or relationships within text.
Not ideal if you prefer manual annotation for small datasets or if you're not comfortable defining labeling rules programmatically.
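To make the idea of rule-driven labeling concrete, here is a minimal sketch using only Python's standard `re` module. It does not use the extr-ds API, which this page does not describe; the rule tables, the `Entity` and `Relation` classes, and the function names are all hypothetical and chosen for illustration only.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    label: str
    text: str
    start: int
    end: int


@dataclass(frozen=True)
class Relation:
    label: str
    head: Entity
    tail: Entity


# Hypothetical entity rules: each label maps to a regular expression.
ENTITY_RULES = {
    "PERSON": re.compile(r"\b(?:Alice|Bob)\b"),
    "CITY": re.compile(r"\b(?:Paris|Berlin)\b"),
}

# Hypothetical relation rules: (relation label, head label,
# pattern that must cover the text between the entities, tail label).
RELATION_RULES = [
    ("LIVES_IN", "PERSON", re.compile(r"\s+lives in\s+"), "CITY"),
]


def label_entities(text):
    """Apply every entity rule and return the matches in document order."""
    entities = []
    for label, pattern in ENTITY_RULES.items():
        for match in pattern.finditer(text):
            entities.append(Entity(label, match.group(), match.start(), match.end()))
    return sorted(entities, key=lambda e: e.start)


def label_relations(text, entities):
    """Check each pair of adjacent entities against the relation rules."""
    relations = []
    for head, tail in zip(entities, entities[1:]):
        between = text[head.end:tail.start]
        for label, head_label, pattern, tail_label in RELATION_RULES:
            if (
                head.label == head_label
                and tail.label == tail_label
                and pattern.fullmatch(between)
            ):
                relations.append(Relation(label, head, tail))
    return relations


text = "Alice lives in Paris. Bob visited Berlin."
entities = label_entities(text)
relations = label_relations(text, entities)

for entity in entities:
    print(entity.label, entity.text, entity.start, entity.end)
for relation in relations:
    print(relation.label, relation.head.text, "->", relation.tail.text)
```

Keeping character offsets on each entity is a common choice for NER datasets, since most training formats expect spans rather than bare strings. A real rule set would be far larger than these two patterns.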
Stars: 8
Forks: 1
Language: Python
License: MIT
Category:
Last pushed: Jun 10, 2023
Commits (30d): 0
Dependencies: 1
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/dpasse/extr-ds"
Open to everyone: 100 requests/day with no key needed, or 1,000 requests/day with a free key.
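The same request can be made from Python with the standard library. This is a sketch under assumptions: the page does not document the response format, so the code assumes a JSON body, and reading the three path segments as category, owner, and repository is inferred from the URL above rather than stated anywhere. The function names are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL; the segment meanings are an assumption."""
    path = "/".join(quote(part, safe="") for part in (category, owner, repo))
    return f"{BASE_URL}/{path}"


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the record for one repository.

    Assumption: the endpoint returns JSON. The page does not say so.
    """
    url = build_quality_url(category, owner, repo)
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


url = build_quality_url("nlp", "dpasse", "extr-ds")
print(url)
# Call fetch_quality("nlp", "dpasse", "extr-ds") to perform the request.
```

The network call is kept inside `fetch_quality` so that importing or running the snippet does not count against the daily request limit.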
Higher-rated alternatives
davidsbatista/BREDS
"Bootstrapping Relationship Extractors with Distributional Semantics" (Batista et al., 2015) in...
davidsbatista/Snowball
Implementation with some extensions of the paper "Snowball: Extracting Relations from Large...
nicolay-r/AREkit
Document level Attitude and Relation Extraction toolkit (AREkit) for sampling and processing...
plkmo/BERT-Relation-Extraction
PyTorch implementation for "Matching the Blanks: Distributional Similarity for Relation Learning" paper
thunlp/FewRel
A Large-Scale Few-Shot Relation Extraction Dataset