cvondrick/soundnet
SoundNet: Learning Sound Representations from Unlabeled Video. NIPS 2016
This project helps you understand and categorize sounds in your audio files by leveraging a pre-trained model derived from millions of unlabeled videos. You provide MP3s or other audio files, and it tells you what objects or scenes are likely present in the sound, or it extracts detailed sound features for further analysis. It's ideal for researchers or practitioners working with environmental audio, soundscapes, or multimedia content.
464 stars. No commits in the last 6 months.
Use this if you need to automatically identify objects or scenes from audio, or extract meaningful feature representations from sound for tasks like indexing, searching, or classification without extensive manual labeling.
Not ideal if you're looking for a simple, out-of-the-box application with a graphical interface, or if your primary interest is speech recognition or music analysis.
Stars
464
Forks
94
Language
Lua
License
MIT
Category
ML frameworks
Last pushed
Oct 07, 2017
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/cvondrick/soundnet"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
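The same endpoint can be queried from Python. A minimal sketch using only the standard library is below; the `X-API-Key` header name is an assumption (the page doesn't say how a key is sent — check the API docs), and the no-key path matches the open 100 requests/day tier.

```python
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality"

def fetch_repo_quality(category, owner, repo, api_key=None):
    """Fetch the quality record for a repo from the API above.

    The api_key header name below is an assumption, not documented
    on this page -- verify it against the API's own docs.
    """
    url = f"{API_BASE}/{category}/{owner}/{repo}"
    req = urllib.request.Request(url)
    if api_key:
        req.add_header("X-API-Key", api_key)  # assumed header name
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)

if __name__ == "__main__":
    # Anonymous access, limited to 100 requests/day per the page.
    data = fetch_repo_quality("ml-frameworks", "cvondrick", "soundnet")
    print(json.dumps(data, indent=2))
```

The `category`/`owner`/`repo` path segments mirror the curl example, so the same function works for any repo listed on this site.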
Related frameworks
iver56/audiomentations
A Python library for audio data augmentation. Useful for making audio ML models work well in the...
Rikorose/DeepFilterNet
Noise suppression using deep filtering
torchsynth/torchsynth
A GPU-optional modular synthesizer in pytorch, 16200x faster than realtime, for audio ML researchers.
marl/openl3
OpenL3: Open-source deep audio and image embeddings
archinetai/audio-data-pytorch
A collection of useful audio datasets and transforms for PyTorch.