mova-institute/zoloto

A manually annotated morphological, syntactic, and coreference corpus of the Ukrainian language

22 / 100

Experimental

This project is a foundational "gold standard" corpus for the Ukrainian language. It provides manually annotated text that records the morphological, syntactic, and coreferential structure of Ukrainian sentences. Linguists and NLP researchers use it to develop and refine language models and other tools that need structured linguistic data for Ukrainian.

No commits in the last 6 months.

Use this if you are a computational linguist or NLP researcher working on Ukrainian language processing and need a high-quality, manually annotated dataset for training or evaluating models.

Not ideal if you are looking for a pre-built, stable version of the Universal Dependencies treebank for immediate use; in that case, refer to the published UD_Ukrainian-IU version.
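Published Universal Dependencies treebanks such as UD_Ukrainian-IU are distributed in the tab-separated CoNLL-U format. As a rough illustration of what "morphological and syntactic annotation" looks like in practice, here is a minimal Python sketch that reads CoNLL-U text into token records. The sample sentence and its annotations are made up for this example and are not taken from the corpus; the names Token, parse_feats, and parse_conllu are hypothetical helpers, and whether the files in this repository use exactly this layout is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Token:
    id: str       # token index within the sentence (kept as text: may be a range like "1-2")
    form: str     # surface word form
    lemma: str    # dictionary form
    upos: str     # universal part-of-speech tag
    feats: dict   # morphological features, e.g. {"Case": "Acc"}
    head: str     # id of the syntactic head ("0" for the root)
    deprel: str   # dependency relation to the head


def parse_feats(field):
    """Turn 'Case=Acc|Number=Sing' into a dict; '_' means no features."""
    if field == "_":
        return {}
    return dict(pair.split("=", 1) for pair in field.split("|"))


def parse_conllu(text):
    """Split CoNLL-U text into sentences, each a list of Token records."""
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comments such as "# text = ..."
        cols = line.split("\t")
        if len(cols) != 10:
            raise ValueError(f"expected 10 columns, got {len(cols)}: {line!r}")
        current.append(
            Token(cols[0], cols[1], cols[2], cols[3],
                  parse_feats(cols[5]), cols[6], cols[7])
        )
    if current:
        sentences.append(current)
    return sentences


# Illustrative sentence ("I am reading a book."), NOT taken from the corpus.
ROWS = [
    ["1", "Я", "я", "PRON", "_", "Case=Nom|Number=Sing|Person=1", "2", "nsubj", "_", "_"],
    ["2", "читаю", "читати", "VERB", "_", "Number=Sing|Person=1|Tense=Pres", "0", "root", "_", "_"],
    ["3", "книгу", "книга", "NOUN", "_", "Case=Acc|Gender=Fem|Number=Sing", "2", "obj", "_", "SpaceAfter=No"],
    ["4", ".", ".", "PUNCT", "_", "_", "2", "punct", "_", "_"],
]
SAMPLE = "# text = Я читаю книгу.\n" + "\n".join("\t".join(r) for r in ROWS) + "\n\n"

sentences = parse_conllu(SAMPLE)
for tok in sentences[0]:
    print(tok.form, tok.lemma, tok.upos, tok.deprel, tok.feats)
```

For real work, an established CoNLL-U reader is likely a better choice than this hand-rolled parser, which skips multiword-token handling and the XPOS, DEPS, and MISC columns.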

computational linguistics, natural language processing, Ukrainian language, corpus annotation, linguistic research

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 7 / 25

Maturity 8 / 25

Community 7 / 25

Higher-rated alternatives

nert-nlp/streusle

STREUSLE: a corpus with comprehensive lexical semantic annotation (multiword expressions, supersenses)

bretttolbert/verbecc

Verbe Complete Conjugator (verbecc) supports Catalan, Spanish, French, Italian, Portuguese and...

natasha/yargy

Rule-based facts extraction for Russian language

bjascob/LemmInflect

A python module for English lemmatization and inflection.

google-research/turkish-morphology

A two-level morphological analyzer for Turkish.
