apple/ml-spatial-librispeech

A large synthetic dataset of spatial audio with multiple labels

/ 100

Emerging

This project provides a large collection of synthetic spatial audio recordings designed to train machine learning models. It takes standard speech samples and augments them with realistic room acoustics and sound source positions. The output is ambisonic audio, accompanied by labels indicating where sounds are coming from, where speakers are facing, and details about the simulated room. This dataset is for audio engineers and researchers developing AI models for tasks like sound localization or spatial audio processing.
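To give a feel for what "ambisonic audio" with a labelled source position means, here is a minimal sketch of free-field first-order ambisonic encoding. It is an illustration only, not the dataset's generation pipeline, which simulates full room acoustics. The function name `encode_foa`, the ACN channel order (W, Y, Z, X), and the SN3D normalization are assumptions chosen for the example; the listing does not say which convention the dataset uses.

```python
import math

def encode_foa(samples, azimuth_deg, elevation_deg):
    """Encode mono samples into first-order ambisonics.

    Assumed convention (not confirmed by the listing): ACN channel
    order W, Y, Z, X with SN3D normalization; azimuth is measured
    counterclockwise from the front, elevation upward from the
    horizontal plane.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    gains = (
        1.0,                          # W: omnidirectional
        math.sin(az) * math.cos(el),  # Y: left-right
        math.sin(el),                 # Z: up-down
        math.cos(az) * math.cos(el),  # X: front-back
    )
    # One output channel per gain, each a scaled copy of the input.
    return [[g * s for s in samples] for g in gains]

# Hypothetical three-sample signal placed directly to the left.
mono = [0.0, 0.5, -0.25]
w, y, z, x = encode_foa(mono, azimuth_deg=90.0, elevation_deg=0.0)
```

With the source at 90 degrees azimuth, almost all directional energy lands in the Y channel and X is close to zero. A localization model trained on a dataset like this one would learn the inverse mapping, from channels back to direction, under reverberant conditions.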

125 stars. No commits in the last 6 months.

Use this if you need a comprehensive dataset of spatial audio to train machine learning models for sound source localization, speech enhancement, or virtual acoustics applications.

Not ideal if you are looking for a simple tool to process existing audio files or if you require real-world, non-synthetic spatial audio recordings.

spatial-audio machine-learning-datasets acoustic-modeling audio-research sound-localization

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 10 / 25

Stars

125

Forks

Language

—

License

—

Higher-rated alternatives

hstsethi/in-mob-prefix

Dataset, charts, models of 4 digit mobile number prefixes in India by state, operator name.

Nexdata-AI/359-Hours-Indonesian-Speech-Data-by-Mobile-Phone_Reading

Indonesian Speech Dataset

Nexdata-AI/207-Hours-Japanese-Speaking-English-Speech-Data-by-Mobile-Phone

Japanese Speaking English Speech Dataset

Nexdata-AI/98-Hours-Taiwan-Mandarin-Speech-Data-by-Mobile-Phone_Reading

Taiwan Speech Dataset

Nexdata-AI/607-Hours-Cantonese-Conversational-Speech-Data-by-Mobile-Phone-and-Voice-Recorder

Cantonese Conversational Speech Dataset
