gheyret/thuyg20_scripts

Script files for THUYG-20 (a free Uyghur speech database released by CSLT@Tsinghua University & Xinjiang University)


Experimental

This repository provides standard Uyghur Latin script files to accompany the THUYG-20 speech database. It takes the original non-standard scripts that accompany the THUYG-20 audio files and converts them into the widely recognized Uyghur Latin script. The resource is intended for speech technology researchers and developers working on Uyghur speech recognition.

No commits in the last 6 months.

Use this if you are developing a Uyghur speech recognition system and need the THUYG-20 speech database transcripts in standard Uyghur Latin script.

Not ideal if you are looking for the actual audio files of the THUYG-20 database, as this only provides the corrected script files.

Uyghur language processing, speech recognition development, linguistic data preparation, natural language processing

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 6 / 25

Maturity 8 / 25

Community 8 / 25



Higher-rated alternatives

qianchang/zici

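Zici (characters and words): a collection of resources on Chinese classical studies and the pinyin of Chinese characters and words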

gheyret/UQSpeechDataset

Uyghur Single Speaker Speech Dataset.

speechio/BigCiDian

Pronunciation lexicon covering both English and Chinese for automatic speech recognition.

apluka34/Bud500

Bud500: A Comprehensive Vietnamese ASR Dataset

harisbinzia/PronouncUR

PronouncUR: An Urdu Pronunciation Lexicon Generator
