ammarasmro/Kurdish-Language
Applications of NLP on the Kurdish language
This project helps linguists and language technology developers working with the Kurdish language. It takes raw Kurdish audio files (like .sph or .wav) and their corresponding transcripts, then processes them to create structured datasets. The output is a JSON representation ready for training speech recognition models.
No commits in the last 6 months.
Use this if you are a researcher or developer focused on building speech recognition systems specifically for the Kurdish language and need to prepare audio and text data.
Not ideal if you are looking for a ready-to-use speech recognition application, as this project focuses on data preparation and model training infrastructure.
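The data preparation described above (pairing audio files with transcripts and emitting JSON) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the directory layout, the same-named `.txt` transcript convention, the function names `build_manifest` and `write_manifest`, and the output field names are all assumptions. It reads only `.wav` files with Python's standard `wave` module; `.sph` files would need converting to WAV first.

```python
import json
import wave
from pathlib import Path


def build_manifest(audio_dir, transcript_dir):
    """Pair each .wav file with a same-named .txt transcript (assumed layout)."""
    entries = []
    for wav_path in sorted(Path(audio_dir).glob("*.wav")):
        txt_path = Path(transcript_dir) / (wav_path.stem + ".txt")
        if not txt_path.exists():
            continue  # skip audio that has no matching transcript
        with wave.open(str(wav_path), "rb") as wav:
            # duration = number of frames / frames per second
            duration = wav.getnframes() / wav.getframerate()
        entries.append({
            "audio": str(wav_path),                # hypothetical field name
            "duration_seconds": round(duration, 3),
            "text": txt_path.read_text(encoding="utf-8").strip(),
        })
    return entries


def write_manifest(entries, out_path):
    # ensure_ascii=False keeps Kurdish characters readable in the JSON file
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(entries, f, ensure_ascii=False, indent=2)
```

A call such as `write_manifest(build_manifest("audio", "transcripts"), "manifest.json")` would then produce one JSON record per utterance, a shape that many speech recognition training scripts can consume with little adaptation.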
Stars: 8
Forks: 2
Language: HTML
License: not listed
Category: not listed
Last pushed: Jul 21, 2018
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/ammarasmro/Kurdish-Language"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000 requests/day.
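The same request can be made from a script. The sketch below uses only Python's standard library; the helper names `quality_url` and `fetch_quality` are hypothetical, the `category/owner/repo` path pattern is inferred from the single URL shown above, and the response is assumed (not confirmed) to be JSON.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    # Path pattern inferred from the one example URL; treat it as an assumption.
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10):
    # Assumes a JSON response body; the schema is not documented on this page.
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "ammarasmro", "Kurdish-Language")` would issue the same GET request as the curl command, subject to the daily request limit.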
Higher-rated alternatives
tihu-nlp/tihu
Persian Text-To-Speech
persiandataset/PersianSpeech
Persian ASR dataset
MahtaFetrat/ManaTTS-Persian-Speech-Dataset
ManaTTS is the largest open Persian speech dataset with 114+ hours of transcribed audio....
mmahdibarghi/finglish-dataset
Persian-to-Finglish dataset with voice recordings of every sentence, built as a TTS dataset for training Tacotron 2
MahtaFetrat/VirgoolInformal-Speech-Dataset
A dataset of informal Persian audio and text chunks, along with a fully open processing...