ivanvovk/WaveGrad

Implementation of WaveGrad high-fidelity vocoder from Google Brain in PyTorch.

Score: 44 / 100 (Emerging)

This project generates high-quality, natural-sounding speech from mel-spectrograms, which are time-frequency representations of audio. It takes pre-processed speech data (as mel-spectrograms) and outputs audio waveforms, which is useful for text-to-speech systems and voice synthesis. It is aimed at anyone who synthesizes audio or generates lifelike voices from spectral data, such as speech technology researchers or developers building voice assistants.

408 stars. No commits in the last 6 months.

Use this if you need to convert mel-spectrograms into high-fidelity audio waveforms efficiently, especially when fast generation with fewer computational steps matters.

Not ideal if you are looking for an all-in-one text-to-speech solution that handles both text processing and audio generation, as this tool focuses specifically on the vocoder step.

speech-synthesis voice-generation audio-processing text-to-speech digital-audio
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 18 / 25

Stars: 408
Forks: 53
Language: Jupyter Notebook
License: BSD-3-Clause
Last pushed: Jul 07, 2021
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/ivanvovk/WaveGrad"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
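
The same request can be made from a script. The following is a minimal Python sketch, not an official client: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the `category/owner/repo` path layout is inferred only from the single URL shown above, and the response is assumed (not confirmed here) to be JSON.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (hypothetical helper).

    Assumes the path layout is <category>/<owner>/<repo>, as in the
    one example URL; other layouts are not documented here.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record without an API key (hypothetical helper).

    Assumes the body is JSON; json.loads raises ValueError otherwise.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to perform the request.
    print(build_quality_url("diffusion", "ivanvovk", "WaveGrad"))
```

Because unauthenticated use is limited to 100 requests/day, a script that polls this endpoint would likely want to cache results locally rather than refetch on every run.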