gheyret/UQSpeechDataset
Uyghur Single Speaker Speech Dataset (ウイグル語音声データセット, "Uyghur speech dataset").
This dataset pairs Uyghur speech recordings from a single speaker with the corresponding text in multiple scripts. It provides over 28 hours of audio in segments of 10 seconds or less, with text in the Uyghur Arabic, Latin, and Cyrillic scripts. It is designed for researchers and developers building Text-to-Speech systems for the Uyghur language.
No commits in the last 6 months.
Use this if you are developing or training machine learning models for converting written Uyghur text into spoken audio.
Not ideal if you need a dataset for tasks other than text-to-speech, such as speech recognition or natural language understanding, or if you require longer, continuous speech segments.
Stars: 34
Forks: 8
Language: —
License: MIT
Category:
Last pushed: Apr 03, 2022
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/gheyret/UQSpeechDataset"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
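For scripted access, the same request can be sketched in Python using only the standard library. This is a minimal sketch, not a documented client: the base URL is taken from the curl example above, while the helper names `quality_url` and `fetch_quality`, the reading of the path as `<category>/<owner>/<repo>`, and the assumption that the response body is JSON are all illustrative guesses rather than confirmed details of the API.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base URL copied from the curl example; everything else is illustrative.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(owner: str, repo: str, category: str = "voice-ai") -> str:
    """Build the endpoint URL for one repository (hypothetical helper).

    ASSUMPTION: the path layout is <category>/<owner>/<repo>, inferred
    from the single example URL shown on this page.
    """
    parts = (category, owner, repo)
    # Percent-encode each segment so unusual characters cannot break the path.
    return API_BASE + "/" + "/".join(quote(p, safe="") for p in parts)


def fetch_quality(owner: str, repo: str, category: str = "voice-ai",
                  timeout: float = 10.0):
    """Request the record and decode it, ASSUMING the body is JSON."""
    with urlopen(quality_url(owner, repo, category), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("gheyret", "UQSpeechDataset")` would issue the same GET request as the curl command. With the keyless limit of 100 requests/day, it may be worth caching the result locally rather than calling it in a loop.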
Higher-rated alternatives
qianchang/zici
Zici (字词): a collection of resources on Chinese classical studies and on Chinese characters, words, and pinyin
speechio/BigCiDian
Pronunciation lexicon covering both English and Chinese languages for Automatic Speech Recognition.
apluka34/Bud500
Bud500: A Comprehensive Vietnamese ASR Dataset
harisbinzia/PronouncUR
PronouncUR: An Urdu Pronunciation Lexicon Generator
jonsafari/buckeye_dict
Buckeye Pronunciation Dictionary