mova-institute/zoloto
A hand-annotated morphological, syntactic, and coreference corpus of the Ukrainian language
This project is a manually annotated 'gold standard' corpus for the Ukrainian language. It provides richly annotated text that records the morphological, syntactic, and coreferential structure of Ukrainian sentences. Linguists and NLP researchers use it to train and evaluate language models that derive structured linguistic analyses from raw Ukrainian text.
No commits in the last 6 months.
Use this if you are a computational linguist or NLP researcher working on Ukrainian language processing and need a high-quality, manually annotated dataset for training or evaluating models.
Not ideal if you are looking for a pre-built, stable version of the Universal Dependencies treebank for immediate use; in that case, refer to the published UD_Ukrainian-IU version.
Stars: 27
Forks: 2
Language: not listed
License: not listed
Category: not listed
Last pushed: Aug 02, 2022
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/mova-institute/zoloto"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
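The same request can be made from a script. The following is a minimal Python sketch using only the standard library. It assumes the endpoint returns JSON and that the path follows the category/owner/repository pattern seen in the curl example; neither is confirmed by this page. The helper names `quality_url` and `fetch_quality` are hypothetical, and how an API key would be passed is not documented here, so the sketch uses the keyless tier.

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (hypothetical helper).

    Assumption: other projects follow the same
    /<category>/<owner>/<repo> pattern as the curl example.
    """
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send a GET request and decode the body.

    Assumption: the response body is UTF-8 encoded JSON.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request and counts against the daily limit):
# data = fetch_quality("nlp", "mova-institute", "zoloto")
# print(json.dumps(data, indent=2, ensure_ascii=False))
```

Keeping the URL construction separate from the network call makes it possible to check the request target without spending any of the 100 daily requests.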
Higher-rated alternatives
nert-nlp/streusle
STREUSLE: a corpus with comprehensive lexical semantic annotation (multiword expressions, supersenses)
bretttolbert/verbecc
Verbe Complete Conjugator (verbecc) supports Catalan, Spanish, French, Italian, Portuguese and...
natasha/yargy
Rule-based facts extraction for Russian language
bjascob/LemmInflect
A python module for English lemmatization and inflection.
google-research/turkish-morphology
A two-level morphological analyzer for Turkish.