dumitrescustefan/ronec
Romanian Named Entity Corpus (RONEC) version 2.0
This dataset provides over 12,000 Romanian sentences in which named entities (names of people, places, organizations, dates, and similar expressions) are already identified and categorized. Each sentence pairs the raw Romanian text with annotations marking these entities. It is useful to anyone working with Romanian text, such as linguistics researchers or developers of language-based software.
No commits in the last 6 months.
Use this if you need pre-labeled Romanian text to train or evaluate systems that automatically extract specific information like names, locations, or dates.
Not ideal if your primary need is for raw, unlabeled Romanian text or if you are working with languages other than Romanian.
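To make the idea of pre-labeled text concrete, here is a minimal sketch of how token-level annotations can be turned back into entity spans. It assumes a BIO-style tagging scheme (B- opens an entity, I- continues it, O is outside any entity); this page does not state the corpus's actual file format. The sentence, the label names (PERSON, GPE, DATETIME), and the helper `bio_to_spans` are illustrative assumptions, not taken from the repository.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity text, label) pairs.

    Assumes BIO-style tags; stray I- tags with no matching open entity are ignored.
    """
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_label = None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts: close any entity that is still open.
            if current_label is not None:
                spans.append((" ".join(current_tokens), current_label))
            current_tokens = [token]
            current_label = tag[2:]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # Continuation of the entity that is currently open.
            current_tokens.append(token)
        else:
            # "O" tag or a mismatched I- tag: close the open entity, if any.
            if current_label is not None:
                spans.append((" ".join(current_tokens), current_label))
            current_tokens = []
            current_label = None

    # Close an entity that runs to the end of the sentence.
    if current_label is not None:
        spans.append((" ".join(current_tokens), current_label))
    return spans


# Hypothetical example sentence and labels, for illustration only.
example_tokens = ["Ion", "Popescu", "a", "vizitat", "București", "în", "2019", "."]
example_tags = ["B-PERSON", "I-PERSON", "O", "O", "B-GPE", "O", "B-DATETIME", "O"]
example_spans = bio_to_spans(example_tokens, example_tags)
```

A named-entity recognizer trained on such data predicts the tag sequence; comparing the predicted spans with the annotated ones is the basis for evaluation.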
Stars: 68
Forks: 16
Language: Python
License: MIT
Category:
Last pushed: Nov 19, 2022
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/dumitrescustefan/ronec"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
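The same request can be made from Python with only the standard library. This is a sketch under one assumption: that the endpoint returns a JSON body, which this page does not confirm. The helper names `parse_response` and `fetch_quality` are hypothetical, and no API key is sent, since the page does not say how a key would be passed.

```python
import json
import urllib.request

# URL copied from the curl command above.
API_URL = "https://pt-edge.onrender.com/api/v1/quality/nlp/dumitrescustefan/ronec"


def parse_response(raw: bytes):
    """Decode a response body, assuming it is UTF-8 encoded JSON (an assumption)."""
    return json.loads(raw.decode("utf-8"))


def fetch_quality(url: str = API_URL, timeout: float = 10.0):
    """Send a keyless GET request and return the parsed body.

    Raises urllib.error.URLError (or its subclass HTTPError) on network or
    HTTP failures, and json.JSONDecodeError if the body is not JSON.
    """
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_response(response.read())
```

Calling `fetch_quality()` performs the network request, so it counts against the daily request limit stated above.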
Higher-rated alternatives
MantisAI/nervaluate
Full named-entity (i.e., not tag/token) evaluation metrics based on SemEval’13
dice-group/gerbil
GERBIL - General Entity annotatoR Benchmark
bltlab/seqscore
SeqScore: Scoring for named entity recognition and other sequence labeling tasks
syuoni/eznlp
Easy Natural Language Processing
LHNCBC/metamaplite
A near real-time named-entity recognizer