Speech Recognition Datasets Voice AI Tools
This directory tracks 17 speech recognition dataset tools. One scores above 50 (the Established tier). The highest-rated is double22a/speech_dataset at 54/100 with 453 stars.
Get all 17 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=voice-ai&subcategory=speech-recognition-datasets&limit=20"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
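The curl request above can also be issued from Python using only the standard library. This is a minimal sketch: the endpoint and query parameters are copied from the curl command, while the function names `build_url` and `fetch_projects` are illustrative, and the shape of the returned JSON is not documented here, so the code makes no assumption about its fields.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl command on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="voice-ai",
              subcategory="speech-recognition-datasets",
              limit=20):
    """Build the query URL with the same parameters as the curl example."""
    query = urlencode({
        "domain": domain,
        "subcategory": subcategory,
        "limit": limit,
    })
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=10):
    """Fetch the URL and decode the body as JSON.

    The response structure is not described on this page, so the parsed
    object is returned as-is for the caller to inspect.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Example usage (performs a network request, counted against the daily limit):
# data = fetch_projects(build_url())
# print(json.dumps(data, indent=2)[:500])
```

Keeping URL construction separate from the network call makes it easy to check the request before spending any of the 100 keyless requests per day.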
| # | Tool | Score | Tier |
|---|---|---|---|
| 1 | double22a/speech_dataset: The dataset of Speech Recognition | | Established |
| 2 | Jakobovski/free-spoken-digit-dataset: A free audio dataset of spoken digits. An audio version of MNIST. | | Emerging |
| 3 | Ijwi-ry-Ikirundi-AI/Kirundi_Dataset: 🇧🇮 The first large-scale, open-source speech and text dataset for Kirundi... | | Emerging |
| 4 | lottev1991/Project-AIdol-Public-English-Dataset: Public female English corpus used for Project AI❤dol | | Emerging |
| 5 | Jahangirbd23/WenetSpeech-Yue: 📑 Explore WenetSpeech-Yue, a comprehensive Cantonese speech corpus with rich... | | Experimental |
| 6 | Nexdata-AI/338-Hours-Spanish-Speech-Data-by-Mobile-Phone: Spanish Speech Dataset | | Experimental |
| 7 | Nexdata-AI/1000-Hours-Filipino-Speaking-English-Speech-Data-by-Mobile-Phone: Filipino English Speech Dataset | | Experimental |
| 8 | Nexdata-AI/20-Hours-American-English-Speech-Synthesis-Corpus-Male: American English Speech Synthesis Corpus | | Experimental |
| 9 | Nexdata-AI/50-Hours-American-Children-Speech-Data-by-Microphone: American Children Speech Dataset | | Experimental |
| 10 | Nexdata-AI/548-Hours-Taiwanese-Accent-Mandarin-Spontaneous-Speech-Data: Taiwanese Accent Mandarin Spontaneous Speech Data | | Experimental |
| 11 | Nexdata-AI/155-People-Malay-Speech-Data-by-Mobile-Phone_Guiding: Malay Speech Dataset | | Experimental |
| 12 | Nexdata-AI/760-Hours-Hindi-Conversational-Speech-Data-by-Telephone: 760 Hours - Hindi Conversational Speech Data by Telephone | | Experimental |
| 13 | Nexdata-AI/357-Hours-Korean-Speech-Data-by-Mobile-Phone: Korean Speech Dataset | | Experimental |
| 14 | Nexdata-AI/10-Hours-Far-filed-Noise-Speech-Data-in-Home-Environment-by-Mic-Array: Far-filed Noise Speech Dataset | | Experimental |
| 15 | Nexdata-AI/201-People-Infant-Cry-Speech-Data-by-Mobile-Phone: Infant Cry Speech Dataset | | Experimental |
| 16 | Nexdata-AI/500-Hours-Korean-Conversational-Speech-Data-by-Mobile-Phone: The dataset of Korean conversational speech | | Experimental |
| 17 | Nexdata-AI/520-Hours-French-Speaking-English-Speech-Data-by-Mobile-Phone: French Speech Dataset | | Experimental |