apple/ml-spatial-librispeech
A large synthetic dataset of spatial audio with multiple labels
This project provides a large collection of synthetic spatial audio recordings designed to train machine learning models. It takes standard speech samples and augments them with realistic room acoustics and sound source positions. The output is ambisonic audio, accompanied by labels indicating where sounds are coming from, where speakers are facing, and details about the simulated room. This dataset is for audio engineers and researchers developing AI models for tasks like sound localization or spatial audio processing.
125 stars. No commits in the last 6 months.
Use this if you need a comprehensive dataset of spatial audio to train machine learning models for sound source localization, speech enhancement, or virtual acoustics applications.
Not ideal if you are looking for a simple tool to process existing audio files or if you require real-world, non-synthetic spatial audio recordings.
Stars: 125
Forks: 8
Language: not listed
License: not listed
Category: not listed
Last pushed: Oct 25, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/apple/ml-spatial-librispeech"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
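The same request can be made from a script. The following is a minimal Python sketch, not an official client. It assumes the three path segments after /quality/ are a category, a repository owner, and a repository name (inferred only from the curl example above) and that the endpoint returns JSON, which this page does not confirm. The helper names quality_url and fetch_quality are hypothetical.

```python
import json
import urllib.request

# Base path copied from the curl example; everything after it is
# assumed to be <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build an endpoint URL with the same shape as the curl example."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send a GET request and parse the body.

    Assumption: the response body is JSON. If it is not,
    json.loads raises a ValueError.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling fetch_quality("ml-frameworks", "apple", "ml-spatial-librispeech") issues one request, which counts against the keyless limit of 100 requests/day, so cache results when looping over many repositories.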
Higher-rated alternatives
hstsethi/in-mob-prefix
Dataset, charts, and models of 4-digit mobile number prefixes in India, by state and operator name.
Nexdata-AI/359-Hours-Indonesian-Speech-Data-by-Mobile-Phone_Reading
Indonesian Speech Dataset
Nexdata-AI/207-Hours-Japanese-Speaking-English-Speech-Data-by-Mobile-Phone
Japanese Speaking English Speech Dataset
Nexdata-AI/98-Hours-Taiwan-Mandarin-Speech-Data-by-Mobile-Phone_Reading
Taiwan Speech Dataset
Nexdata-AI/607-Hours-Cantonese-Conversational-Speech-Data-by-Mobile-Phone-and-Voice-Recorder
Cantonese Conversational Speech Dataset