gheyret/UQSpeechDataset

Uyghur Single Speaker Speech Dataset. Uyghur speech dataset.

/ 100

Emerging

This dataset pairs Uyghur speech recordings from a single speaker with transcripts in multiple scripts. It provides over 28 hours of audio in segments of 10 seconds or less, with text in the Uyghur Arabic, Latin, and Cyrillic scripts. It is intended for researchers and developers building Text-to-Speech systems for the Uyghur language.
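
The 10-second segment limit described above can be checked before training. The following is a minimal sketch, assuming the clips are available as PCM WAV files (the listing does not state the audio format); the function names and the `MAX_SEGMENT_SECONDS` constant are hypothetical and not part of the dataset.

```python
import wave

# Limit taken from the dataset description; the constant name is hypothetical.
MAX_SEGMENT_SECONDS = 10.0


def clip_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        # Duration = number of audio frames / frames per second.
        return wav.getnframes() / wav.getframerate()


def is_within_limit(path, limit=MAX_SEGMENT_SECONDS):
    """Return True if the clip is no longer than `limit` seconds."""
    return clip_duration_seconds(path) <= limit
```

Calling `is_within_limit("clip.wav")` on each file (the path here is a placeholder) would flag any clip that exceeds the stated limit. If the audio is stored in another format, the duration would need to be read with a suitable decoder instead.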

No commits in the last 6 months.

Use this if you are developing or training machine learning models for converting written Uyghur text into spoken audio.

Not ideal if you need a dataset for tasks other than text-to-speech, such as speech recognition or natural language understanding, or if you require longer, continuous speech segments.

Uyghur language processing, Text-to-Speech development, Speech synthesis, Language technology research, Digital humanities

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 7 / 25

Maturity 16 / 25

Community 17 / 25
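
The listing does not say how the four category scores combine into the overall score out of 100. One plausible reading, and purely an assumption here, is a plain sum of the four 25-point categories. A small sketch of that arithmetic, with a hypothetical `overall_score` helper:

```python
# Category scores as listed above, each out of 25.
category_scores = {
    "Maintenance": 0,
    "Adoption": 7,
    "Maturity": 16,
    "Community": 17,
}


def overall_score(scores):
    # Assumption: the overall score is the unweighted sum of the categories.
    # The site's real formula may differ.
    return sum(scores.values())


print(overall_score(category_scores))  # 40
```

Under that assumption the total would be 40 out of 100; any weighting the site applies would change the result.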

Language: not specified

License: MIT

Higher-rated alternatives

qianchang/zici

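Zici (characters and words): a collection of resources on Chinese classical studies and on Chinese characters, words, and pinyin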

speechio/BigCiDian

Pronunciation lexicon covering both English and Chinese languages for Automatic Speech Recognition.

apluka34/Bud500

Bud500: A Comprehensive Vietnamese ASR Dataset

harisbinzia/PronouncUR

PronouncUR: An Urdu Pronunciation Lexicon Generator

jonsafari/buckeye_dict

Buckeye Pronunciation Dictionary
