KurdishBLARK/InterdialectCorpus

A parallel corpus of Sorani, Kurmanji and English

Emerging

This corpus provides carefully aligned translations between Sorani, Kurmanji, and English. It is built from news articles in the three languages and presents them as parallel texts, so readers can compare how sentences are rendered across the two Kurdish dialects and English. It is aimed at translators, linguists, and computational linguists working on Kurdish.

No commits in the last 6 months.

Use this if you need accurate, manually aligned text pairs for translation work, linguistic analysis, or developing language technologies for Kurdish.

Not ideal if you need a corpus covering languages other than Sorani, Kurmanji, and English, or an unaligned, general-purpose text collection.

Kurdish language, translation resources, linguistic research, language learning, computational linguistics

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 6 / 25

Maturity 16 / 25

Community 14 / 25

Language: —

License: —

Higher-rated alternatives

Helsinki-NLP/OpusFilter

OpusFilter - Parallel corpus processing toolkit

natasha/corus

Links to Russian corpora + Python functions for loading and parsing

SergeyShk/ruTS

A library for extracting statistics from Russian-language texts.

darija-open-dataset/dataset

Darija <-> English dataset

omicsNLP/Auto-CORPus

Auto-CORPus pipeline developed by a University of Nottingham and Imperial College London...
