rishikksh20/VocGAN

VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network

47 / 100 (Emerging)

This tool helps speech synthesis researchers and developers convert mel spectrograms back into high-fidelity audio in real time. It takes a mel spectrogram as input and outputs natural-sounding speech, which makes it a good fit for building or experimenting with text-to-speech and voice cloning systems.

321 stars. No commits in the last 6 months.

Use this if you need to quickly generate high-quality speech audio from mel spectrograms for your speech synthesis projects.

Not ideal if you're looking for a complete, end-to-end text-to-speech system that handles both text processing and audio generation.

speech-synthesis voice-generation audio-reconstruction text-to-speech
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 21 / 25
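
The four category scores above appear to add up to the overall 47 / 100. The page does not state the formula, so treating the total as a plain sum of four 25-point categories is an assumption; a quick check:

```python
# Category scores copied from the listing above (each out of 25).
subscores = {
    "Maintenance": 0,
    "Adoption": 10,
    "Maturity": 16,
    "Community": 21,
}

# Assumption: the overall score is the unweighted sum of the categories.
total = sum(subscores.values())
max_total = 25 * len(subscores)

print(f"{total} / {max_total}")
```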

Stars: 321
Forks: 59
Language: Python
License: MIT
Last pushed: Jul 25, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/rishikksh20/VocGAN"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
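
The same request can be made from Python with only the standard library. This is a minimal sketch: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the split of the path into category, owner, and repo is inferred from the curl URL above, and the response is assumed to be JSON, since the page does not document its format.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base URL taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    # Percent-encode each path segment so unusual names stay valid.
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality data; assumes the response body is JSON."""
    url = build_quality_url(category, owner, repo)
    with urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs a network request, counts against the daily limit):
# data = fetch_quality("voice-ai", "rishikksh20", "VocGAN")
```

With the keyless limit of 100 requests/day, it is worth caching the result locally rather than refetching on every run.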