lmnt-com/wavegrad
A fast, high-quality neural vocoder.
This tool converts a spectral representation of audio, such as a log-scaled Mel spectrogram, into realistic speech audio. It takes this compressed representation of sound and reconstructs the full, high-fidelity waveform. Voice AI researchers, audio engineers, and developers building text-to-speech systems can use it to generate natural-sounding speech from spectral data.
296 stars. No commits in the last 6 months.
Use this if you need to convert Mel spectrograms into high-quality, natural-sounding audio waveforms for speech generation.
Not ideal if you want a simple text-to-speech solution that does not require working with spectrogram data directly, or if your primary need is general audio processing beyond speech.
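To make the input format more concrete, here is a minimal sketch of the quantities behind a log-scaled Mel spectrogram. It uses the common HTK-style Mel formula and a centered-framing frame count; the function names, the 80-band / 256-sample-hop / 22,050 Hz values, and the log floor are illustrative assumptions, not settings taken from this repository.

```python
import math


def hz_to_mel(freq_hz: float) -> float:
    """Convert a frequency in Hz to the Mel scale (HTK-style formula)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_centers(n_mels: int, f_min: float, f_max: float) -> list:
    """Center frequencies (Hz) of n_mels bands spaced evenly on the Mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_mels + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_mels)]


def spectrogram_shape(n_samples: int, hop: int, n_mels: int) -> tuple:
    """(bands, frames) of a spectrogram, assuming centered, padded framing."""
    n_frames = 1 + n_samples // hop
    return (n_mels, n_frames)


def log_scale(power: float, floor: float = 1e-5) -> float:
    """Log-compress one spectrogram value; the floor avoids log(0)."""
    return math.log10(max(power, floor))


# Hypothetical settings: one second of 22,050 Hz audio, 80 Mel bands, hop of 256.
centers = mel_band_centers(n_mels=80, f_min=0.0, f_max=8000.0)
shape = spectrogram_shape(n_samples=22050, hop=256, n_mels=80)
```

A vocoder works in the opposite direction: given an array of that shape, it has to produce the `n_samples` waveform values.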
Stars: 296
Forks: 51
Language: Python
License: Apache-2.0
Category:
Last pushed: Jul 18, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/lmnt-com/wavegrad"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
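The same request can be made from Python with only the standard library. This is a rough sketch under two assumptions that the page does not confirm: that the path segments are category, owner, and repository, and that the endpoint returns JSON. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

# Base path copied from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL; the segment meaning is an assumption."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record and parse it as JSON (assumed response format)."""
    request = Request(
        quality_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "lmnt-com", "wavegrad")` performs the network request and counts against the daily limit, so it is left out of the block itself.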
Higher-rated alternatives
descriptinc/descript-audio-codec
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz...
drethage/speech-denoising-wavenet
A neural network for end-to-end speech denoising
YuanGongND/ast
Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".
iver56/torch-audiomentations
Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.
madhavmk/Noise2Noise-audio_denoising_without_clean_training_data
Source code for the paper titled "Speech Denoising without Clean Training Data: a Noise2Noise...