apple/ml-spatial-librispeech

A large synthetic dataset of spatial audio with multiple labels

Score: 36 / 100 (Emerging)

This project provides a large collection of synthetic spatial audio recordings designed to train machine learning models. It takes standard speech samples and augments them with realistic room acoustics and sound source positions. The output is ambisonic audio, accompanied by labels indicating where sounds are coming from, where speakers are facing, and details about the simulated room. This dataset is for audio engineers and researchers developing AI models for tasks like sound localization or spatial audio processing.

125 stars. No commits in the last 6 months.

Use this if you need a comprehensive dataset of spatial audio to train machine learning models for sound source localization, speech enhancement, or virtual acoustics applications.

Not ideal if you are looking for a simple tool to process existing audio files or if you require real-world, non-synthetic spatial audio recordings.

spatial-audio machine-learning-datasets acoustic-modeling audio-research sound-localization
Stale (6m), No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 10 / 25
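The four category scores above add up to the overall score shown at the top of this page. The sketch below only checks that arithmetic. That the overall score is a plain sum of the categories is inferred from the numbers on this page, not from a documented methodology, and the variable names are illustrative.

```python
# Category scores as listed on this page (each out of 25).
category_scores = {
    "Maintenance": 0,
    "Adoption": 10,
    "Maturity": 16,
    "Community": 10,
}
MAX_PER_CATEGORY = 25

# Assumption: the overall score is the plain sum of the four categories.
total = sum(category_scores.values())
max_total = MAX_PER_CATEGORY * len(category_scores)

print(f"{total} / {max_total}")  # prints "36 / 100"
```

The Maintenance score of 0 is consistent with the "No commits in the last 6 months" note, so the repository's 36 points come entirely from Adoption, Maturity, and Community.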


Stars: 125
Forks: 8
Language: (not shown)
License: (not shown)
Last pushed: Oct 25, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/apple/ml-spatial-librispeech"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
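For scripted access, the same request can be made from Python using only the standard library. This is a minimal sketch, not official client code. The URL pattern (category, then owner, then repository) is inferred from the single curl example above, the assumption that the endpoint returns JSON is not confirmed on this page, and the helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL; the path layout is inferred from one example."""
    parts = [quote(p, safe="") for p in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record. Assumes (unverified) a JSON response body."""
    request = urllib.request.Request(
        quality_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("ml-frameworks", "apple", "ml-spatial-librispeech")
# print(json.dumps(data, indent=2))
```

With the keyless limit of 100 requests/day, it is worth caching the response locally rather than calling the endpoint on every run.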