gheyret/thuyg20_scripts
Script files for THUYG-20 (a free Uyghur speech database released by CSLT@Tsinghua University & Xinjiang University)
This repository provides standard Uyghur Latin script files to accompany the THUYG-20 speech database. It converts the original non-standard transcription scripts that accompany the THUYG-20 audio files into widely recognized Uyghur Latin characters. The resource is intended for speech technology researchers and developers working on Uyghur speech recognition.
No commits in the last 6 months.
Use this if you are developing a Uyghur speech recognition system and need the THUYG-20 speech database transcripts in standard Uyghur Latin script.
Not ideal if you are looking for the actual audio files of the THUYG-20 database, as this only provides the corrected script files.
Stars: 19
Forks: 2
Language: not specified
License: not specified
Category:
Last pushed: Mar 02, 2020
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/gheyret/thuyg20_scripts"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
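The curl call above can also be made from a script. The following is a minimal Python sketch using only the standard library. It assumes the URL path follows a category/owner/repository layout, which is inferred from the single example URL and not documented here. The function names `build_quality_url` and `fetch_quality` are hypothetical, and because the response format is not described on this page, the body is returned as raw text rather than parsed.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base of the endpoint shown in the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the request URL.

    Assumption: the path is <category>/<owner>/<repo>, inferred from the
    one example URL on this page.
    """
    parts = (category, owner, repo)
    # Percent-encode each segment so unusual characters cannot break the path.
    return API_BASE + "/" + "/".join(quote(part, safe="") for part in parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the record and return the raw response body as text.

    Assumption: the body is UTF-8 text; its structure is not documented
    here, so it is not parsed.
    """
    url = build_quality_url(category, owner, repo)
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example call (performs a network request, so it is left commented out):
# print(fetch_quality("voice-ai", "gheyret", "thuyg20_scripts"))
```

With no key, requests made this way count against the 100 requests/day limit, so a script that polls many repositories may want to cache responses locally.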
Higher-rated alternatives
qianchang/zici
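Zici (characters and words): a collection of resources on Chinese classical studies and the pinyin of Chinese characters and words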
gheyret/UQSpeechDataset
Uyghur Single Speaker Speech Dataset.
speechio/BigCiDian
Pronunciation lexicon covering both English and Chinese languages for Automatic Speech Recognition.
apluka34/Bud500
Bud500: A Comprehensive Vietnamese ASR Dataset
harisbinzia/PronouncUR
PronouncUR: An Urdu Pronunciation Lexicon Generator