Rumeysakeskin/Speech-Datasets-for-ASR
Download speech datasets (English and non-English) for Automatic Speech Recognition
This repository helps developers working on speech technology projects find and download free audio datasets in multiple languages. It provides direct links to readily available speech corpora, both English and non-English. Developers of automatic speech recognition (ASR) systems or other speech applications can use it to acquire training data.
No commits in the last 6 months.
Use this if you need free, publicly available audio datasets to train or test your automatic speech recognition (ASR) models or other speech-processing applications.
Not ideal if you need highly specialized audio data not covered by common public datasets, or if you require proprietary or commercially licensed speech corpora.
Stars: 15
Forks: 1
Language: Jupyter Notebook
License: not specified
Category: (not listed)
Last pushed: Jan 22, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/Rumeysakeskin/Speech-Datasets-for-ASR"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
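The same request can be made from Python. The following is a minimal sketch using only the standard library. It assumes the endpoint returns JSON, which this page does not state, and the helper names `build_quality_url` and `fetch_quality` are hypothetical; only the base URL and the `voice-ai` path segment come from the curl example above.

```python
import json
import urllib.parse
import urllib.request

# Base URL copied from the curl example on this page.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(owner: str, repo: str, section: str = "voice-ai") -> str:
    """Build the request URL for a repository (hypothetical helper).

    The default "voice-ai" segment is taken from the curl example; whether
    other values are accepted is not documented here.
    """
    # Percent-encode each path segment so unusual characters stay inside it.
    parts = [urllib.parse.quote(p, safe="") for p in (section, owner, repo)]
    return API_BASE + "/" + "/".join(parts)


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for a repository.

    Assumption: the response body is JSON. If it is not, json.loads
    raises a ValueError.
    """
    url = build_quality_url(owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the URL; call fetch_quality() to perform the request.
    print(build_quality_url("Rumeysakeskin", "Speech-Datasets-for-ASR"))
```

Because unauthenticated access is limited to 100 requests/day, it is worth caching the result of `fetch_quality` rather than calling it repeatedly for the same repository.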
Higher-rated alternatives
ynop/audiomate
Python library for handling audio datasets.
reazon-research/ReazonSpeech
Massive open Japanese speech corpus
common-voice/cv-dataset
Metadata and versioning details for the Common Voice dataset
davidmartinrius/speech-dataset-generator
🔊 Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset...
EgorLakomkin/KTSpeechCrawler
Automatically constructing corpus for automatic speech recognition from YouTube videos