gheyret/UQSpeechDataset

Uyghur Single Speaker Speech Dataset. Uyghur speech dataset.

40 / 100 (Emerging)

This dataset pairs Uyghur speech recordings from a single speaker with their transcripts in multiple scripts. It provides over 28 hours of audio in segments of 10 seconds or less, with text in Uyghur Arabic, Latin, and Cyrillic (Slavic) scripts. It is intended for researchers and developers building Text-to-Speech systems for the Uyghur language.

No commits in the last 6 months.

Use this if you are developing or training machine learning models for converting written Uyghur text into spoken audio.

Not ideal if you need a dataset for tasks other than text-to-speech, such as speech recognition or natural language understanding, or if you require longer, continuous speech segments.

Uyghur language processing, Text-to-Speech development, Speech synthesis, Language technology research, Digital humanities
Stale 6m No Package No Dependents
Maintenance 0 / 25
Adoption 7 / 25
Maturity 16 / 25
Community 17 / 25


Stars 34
Forks 8
Language (none listed)
License MIT
Last pushed Apr 03, 2022
Commits (30d) 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/gheyret/UQSpeechDataset"

Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000 requests/day.
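
The same request can be made from a script. The following is a minimal sketch in Python using only the standard library. It assumes the endpoint returns JSON, which the page does not state. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and the URL pattern (category, owner, repository) is inferred from the single curl example above rather than from any documented API contract.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; everything after it is
# assumed to follow the pattern <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build an endpoint URL in the same shape as the curl example."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository.

    Assumption: the response body is JSON. If it is not,
    json.loads raises a ValueError.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("voice-ai", "gheyret", "UQSpeechDataset")
# print(json.dumps(data, indent=2, ensure_ascii=False))
```

Because the keyless tier is limited to 100 requests/day, it may be worth caching responses locally rather than calling `fetch_quality` repeatedly for the same repository.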