MML-Group/code4AVE-Speech
Source Code for AVE Speech Dataset
This project offers AVE Speech, a Mandarin speech corpus that provides synchronized audio, lip video, and surface electromyography (EMG) signals. The corpus contains 55+ hours of multi-modal data from 100 native speakers, which researchers can use to train and evaluate robust speech recognition models on audio, visual, and EMG input.
No commits in the last 6 months.
Use this if you are a speech recognition researcher looking for a large-scale, multi-modal Mandarin speech dataset to train advanced models.
Not ideal if you are looking for a pre-built, ready-to-deploy speech recognition application rather than a dataset for research and development.
Stars: 12
Forks: n/a
Language: Python
License: MIT
Category:
Last pushed: Aug 28, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/MML-Group/code4AVE-Speech"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
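The same request can be made from Python. The following is a minimal sketch using only the standard library. It assumes the URL path is laid out as category, owner, repository (inferred from the single curl example above, not from any API documentation), and it makes no assumption about the response schema: the body is parsed as JSON if possible and returned as raw text otherwise. The names `build_quality_url` and `fetch_quality` are hypothetical helpers introduced for illustration.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; everything after it is assumed
# to be <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, percent-encoding each path segment."""
    segments = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(segments)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record for one repository.

    Returns the parsed JSON if the body is valid JSON, otherwise the raw
    text. The response format is not documented here, so callers should
    inspect the result before relying on any field.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read().decode("utf-8")
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        return body


# Example call (performs a network request and counts against the daily quota):
# record = fetch_quality("voice-ai", "MML-Group", "code4AVE-Speech")
# print(record)
```

Because the keyless tier is limited to 100 requests per day, it may be worth caching the result locally rather than calling `fetch_quality` repeatedly.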
Higher-rated alternatives
Uberi/speech_recognition
Speech recognition module for Python, supporting several engines and APIs, online and offline.
cmusphinx/pocketsphinx
A small speech recognizer
tensorflow/lingvo
Lingvo
modelscope/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models,...
PyThaiNLP/pythaiasr
Python Thai Automatic Speech Recognition