lmnt-com/wavegrad

A fast, high-quality neural vocoder.

Emerging

This tool turns acoustic features, such as a log-scaled Mel spectrogram, into realistic speech audio. It takes this compressed representation of sound and reconstructs the full, high-fidelity waveform. Voice AI researchers, audio engineers, and developers building text-to-speech systems can use it to generate natural-sounding speech from spectral data.
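To make the input side concrete, here is a minimal NumPy sketch of how a log-scaled Mel spectrogram can be computed from a waveform. It is not WaveGrad's own preprocessing: the function names, the HTK-style Mel formula, and the parameter values (22050 Hz sample rate, 1024-point FFT, hop of 256, 80 Mel bands) are assumptions chosen for illustration, and the project's actual settings may differ.

```python
import numpy as np


def hz_to_mel(hz):
    # HTK-style Mel scale (an assumption; other conventions exist).
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=np.float64) / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=np.float64) / 2595.0) - 1.0)


def mel_filterbank(sample_rate, n_fft, n_mels):
    """Triangular Mel filters, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    bank = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        left, center, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        bank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return bank


def log_mel_spectrogram(audio, sample_rate=22050, n_fft=1024, hop=256, n_mels=80):
    """Return a log-scaled Mel spectrogram of shape (n_mels, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(audio) - n_fft) // hop
    frames = np.stack(
        [audio[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2  # (n_frames, n_fft // 2 + 1)
    mel = power @ mel_filterbank(sample_rate, n_fft, n_mels).T  # (n_frames, n_mels)
    # Clamp before the log so silent bands do not produce -inf.
    return np.log(np.maximum(mel, 1e-10)).T


# One second of a 440 Hz tone stands in for real speech.
sr = 22050
t = np.arange(sr) / sr
audio = 0.5 * np.sin(2.0 * np.pi * 440.0 * t)

spec = log_mel_spectrogram(audio, sr)
print(spec.shape, "values from", audio.size, "samples")
```

The spectrogram holds far fewer values than the waveform and discards phase, which is why a vocoder has to reconstruct the missing detail rather than simply invert the transform.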

296 stars. No commits in the last 6 months.

Use this if you need to convert Mel spectrograms into high-quality, natural-sounding audio waveforms for speech generation.

Not ideal if you want a simple text-to-speech solution that does not require working with spectrogram data directly, or if your primary need is general audio processing beyond speech.

speech-synthesis voice-ai audio-generation signal-processing text-to-speech

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 21 / 25

Stars: 296

Forks:

Language: Python

License: Apache-2.0

Higher-rated alternatives

descriptinc/descript-audio-codec

State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz...

drethage/speech-denoising-wavenet

A neural network for end-to-end speech denoising

YuanGongND/ast

Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".

iver56/torch-audiomentations

Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.

madhavmk/Noise2Noise-audio_denoising_without_clean_training_data

Source code for the paper titled "Speech Denoising without Clean Training Data: a Noise2Noise...
