vb000/Waveformer
A deep neural network architecture for low-latency audio processing
This tool helps audio engineers, sound designers, and researchers isolate specific sounds from mixed audio in real time. You provide a recording containing multiple sounds and name the target sound to extract (such as "Computer keyboard" or "Bark"), and it outputs a new audio file containing only that sound, with minimal delay.
323 stars. No commits in the last 6 months.
Use this if you need to extract individual sounds from complex audio mixtures with very low processing delay, such as for live audio applications or real-time monitoring.
Not ideal if you need a solution that doesn't require a Python development environment or if you primarily work with pre-recorded, non-streaming audio where real-time performance isn't critical.
Stars: 323
Forks: 35
Language: Python
License: MIT
Category:
Last pushed: Aug 15, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/vb000/Waveformer"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
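The same request can be sketched in Python using only the standard library. This is a minimal sketch, not a documented client: the function names `build_quality_url` and `fetch_quality` are hypothetical, the `category/owner/repo` path layout is inferred from the single curl URL above, and the response is assumed (not confirmed by this page) to be JSON.

```python
import json
import urllib.request

# Base path taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (hypothetical helper).

    Assumes the path layout category/owner/repo generalizes from the
    one example URL shown on this page.
    """
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send a GET request and decode the body (hypothetical helper).

    Assumption: the endpoint returns JSON. If it does not, json.loads
    raises json.JSONDecodeError (a ValueError subclass).
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "vb000", "Waveformer")` issues the same GET request as the curl command. Without a key, each call counts toward the 100 requests/day limit.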
Higher-rated alternatives
descriptinc/descript-audio-codec
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz...
drethage/speech-denoising-wavenet
A neural network for end-to-end speech denoising
YuanGongND/ast
Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".
iver56/torch-audiomentations
Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.
lmnt-com/wavegrad
A fast, high-quality neural vocoder.