KurdishBLARK/KTC
Kurdish Textbooks Corpus
A collection of Kurdish-language textbook texts organized for language research. The raw textbook text is categorized by subject matter, which makes it easier to study how Kurdish is used in educational materials. The corpus is aimed at researchers, linguists, and educators working on the Kurdish language.
No commits in the last 6 months.
Use this if you are a researcher or linguist needing a structured dataset of Kurdish educational texts for language analysis.
Not ideal if you are looking for a general-purpose Kurdish text corpus that includes non-academic or informal language.
Stars: 8
Forks: n/a
Language: n/a
License: n/a
Category: n/a
Last pushed: Feb 09, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/KurdishBLARK/KTC"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
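The same request can be made from a script. The following is a minimal Python sketch, not an official client. It assumes the URL follows a `category/owner/repo` pattern (inferred only from the single curl example above) and that the response body is JSON, which the page does not state. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and no API key is sent because the page does not describe how a key is passed.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the curl example; everything after it is an assumed pattern.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, percent-encoding each path segment."""
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return "/".join([BASE_URL, *parts])


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send a GET request and parse the body as JSON (assumed response format)."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to actually hit the network.
    print(build_quality_url("nlp", "KurdishBLARK", "KTC"))
```

Calling `fetch_quality("nlp", "KurdishBLARK", "KTC")` would issue the same request as the curl command. With the keyless limit of 100 requests/day, it is worth caching the result rather than fetching it repeatedly.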
Higher-rated alternatives
Helsinki-NLP/OpusFilter
OpusFilter - Parallel corpus processing toolkit
natasha/corus
Links to Russian corpora + Python functions for loading and parsing
SergeyShk/ruTS
A library for extracting statistics from Russian-language texts.
darija-open-dataset/dataset
Darija <-> English dataset
omicsNLP/Auto-CORPus
Auto-CORPus pipeline developed by a University of Nottingham and Imperial College London...