descriptinc/melgan-neurips

GAN-based Mel-Spectrogram Inversion Network for Text-to-Speech Synthesis

Quality score: 51 / 100 (Established)

This project converts a mel-spectrogram, a time-frequency representation of audio on the perceptual mel scale, into a clear, natural-sounding audio waveform. In a text-to-speech pipeline it serves as the vocoder stage: a separate model turns text into a mel-spectrogram, and this network turns that spectrogram into sound. It is designed for researchers and engineers working on speech synthesis and audio generation.

1,037 stars. No commits in the last 6 months.

Use this if you need to convert mel-spectrograms into high-quality raw audio waveforms quickly, for speech generation or music synthesis.

Not ideal if you want a ready-made, end-user text-to-speech application that does not require working with mel-spectrograms directly.

speech-synthesis audio-generation voice-cloning sound-engineering AI-audio
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 25 / 25
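
The four category scores above, each out of 25, add up to the overall 51 / 100. A minimal Python sketch of that arithmetic follows. It assumes the overall score is a plain sum of the categories, which the page does not state explicitly, and the variable names are illustrative only.

```python
# Category scores as listed on this page, each out of 25.
# Assumption: the overall score is their plain sum; the page does not state the formula.
category_scores = {
    "Maintenance": 0,
    "Adoption": 10,
    "Maturity": 16,
    "Community": 25,
}

total = sum(category_scores.values())
maximum = 25 * len(category_scores)

print(f"{total} / {maximum}")  # prints "51 / 100"
```

Read this way, the zero for Maintenance (no recent commits) is what holds the project back, while Community is already at its ceiling.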


Stars: 1,037
Forks: 213
Language: Python
License: MIT
Last pushed: Aug 28, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/descriptinc/melgan-neurips"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
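
The same request can be made from Python using only the standard library. The sketch below is a hedged illustration, not an official client: `build_quality_url` and `fetch_quality` are hypothetical helper names, the `category/owner/repo` path pattern is inferred from the single curl example above, and the response is assumed to be JSON whose fields are not documented on this page.

```python
import json
import urllib.request

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (path pattern inferred from the curl example)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the endpoint and parse the body as JSON.

    Assumption: the response body is JSON. Its fields are not documented
    here, so the parsed object is returned unchanged.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the URL needs no network access.
url = build_quality_url("voice-ai", "descriptinc", "melgan-neurips")
print(url)
```

Calling `fetch_quality("voice-ai", "descriptinc", "melgan-neurips")` performs the actual request. With the keyless limit of 100 requests/day, it is worth caching the result instead of fetching on every run.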