khuangaf/ITRI-speech-recognition-dataset-generation

Automatic Speech Recognition Dataset Generation

Score: 34 / 100 (Emerging)

This tool helps researchers, linguists, and educators build custom Mandarin speech recognition datasets, especially ones that include Taiwanese or English speech. It takes YouTube videos, extracts the audio and subtitles, and processes them into a dataset suitable for training speech recognition models. It is aimed at users who need specialized Mandarin speech data that isn't readily available.
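To make the subtitle-to-dataset step concrete, here is a minimal sketch of one way to turn subtitle timing into labeled segments. It is not the repository's code: it assumes the subtitles are available as SRT text, and the names `Segment` and `parse_srt` are hypothetical. Each resulting segment gives the time span to cut from the audio and the transcript to pair with it.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """One training example: an audio time span plus its transcript."""
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    text: str


# SRT timestamps look like 00:01:02,250 (some files use '.' instead of ',').
_TIME = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def _to_seconds(stamp: str) -> float:
    match = _TIME.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"bad timestamp: {stamp!r}")
    hours, minutes, seconds, millis = (int(g) for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


def parse_srt(srt_text: str) -> List[Segment]:
    """Parse SRT text (an assumed input format) into timed segments."""
    segments = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        timing = next((i for i, ln in enumerate(lines) if "-->" in ln), None)
        if timing is None:
            continue  # not a cue block
        start, end = lines[timing].split("-->", 1)
        end = end.split()[0]  # drop optional position data after the end time
        text = " ".join(lines[timing + 1:])
        if text:  # skip cues with no transcript
            segments.append(Segment(_to_seconds(start), _to_seconds(end), text))
    return segments


if __name__ == "__main__":
    sample = "1\n00:00:01,000 --> 00:00:03,500\n你好\n"
    for seg in parse_srt(sample):
        print(seg.start, seg.end, seg.text)
```

The segment boundaries could then be passed to any audio tool to cut matching clips; that step is left out here because the source does not say how the repository does it.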

No commits in the last 6 months.

Use this if you need to build a specialized Mandarin speech recognition dataset from YouTube video content, particularly if it includes Taiwanese or English speech.

Not ideal if you need an English-only speech recognition dataset, or if you prefer to manually annotate data.

speech-recognition mandarin-linguistics data-generation linguistic-research educational-technology
No License, Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 7 / 25
Maturity 8 / 25
Community 19 / 25

Stars: 37
Forks: 20
Language: Jupyter Notebook
License: None
Last pushed: Aug 26, 2018
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/khuangaf/ITRI-speech-recognition-dataset-generation"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
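The same request can be made from Python using only the standard library. This is a sketch under two assumptions that the page does not confirm: that the URL path is laid out as category, owner, repository (inferred from the single URL shown above), and that the endpoint returns JSON. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the request URL (path layout is an assumption)."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository, assuming a JSON response body."""
    with urlopen(quality_url(category, owner, repo), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    data = fetch_quality(
        "voice-ai", "khuangaf", "ITRI-speech-recognition-dataset-generation"
    )
    print(data)
```

Without a key, the limit of 100 requests/day applies, so results are worth caching locally if the script runs often.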