keonlee9420/DiffGAN-TTS

PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs

43 / 100 (Emerging)

DiffGAN-TTS helps creators, educators, and content producers turn written text into high-quality, natural-sounding speech. You provide text, and it generates audio files in a single-speaker or multi-speaker voice, with controls for characteristics such as pitch and speaking rate. It suits anyone who needs to produce voiceovers or other spoken content from text quickly.

347 stars. No commits in the last 6 months.

Use this if you need to generate realistic, high-fidelity speech from text for single or multiple speakers, with some control over vocal characteristics.

Not ideal if you require real-time speech synthesis for interactive applications, as this is geared towards generating audio files.

text-to-speech voice-generation audiobook-creation elearning-content content-localization
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 17 / 25
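
The four category scores, each out of 25, add up to the overall score shown above: 0 + 10 + 16 + 17 = 43. The page does not state the formula, so the short Python sketch below only assumes that the total is an unweighted sum of the categories; the variable names are illustrative.

```python
# Category scores as listed on this page, each out of 25.
category_scores = {
    "Maintenance": 0,
    "Adoption": 10,
    "Maturity": 16,
    "Community": 17,
}

# Assumption: the overall score is the unweighted sum of the categories.
total = sum(category_scores.values())
max_total = 25 * len(category_scores)

print(f"{total} / {max_total}")
```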


Stars: 347
Forks: 44
Language: Python
License: MIT
Last pushed: Feb 21, 2022
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/keonlee9420/DiffGAN-TTS"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
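
The curl request above can also be issued from Python using only the standard library. This is a minimal sketch under stated assumptions: the response format is not documented on this page, so the code assumes a JSON body, and it treats the `diffusion` path segment as a category name. The helper names `quality_url` and `fetch_quality` are hypothetical, introduced here for illustration.

```python
import json
import urllib.request

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL for a repo given as 'owner/name'.

    Assumption: the segment before the repo (e.g. 'diffusion') is a category.
    """
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Send a GET request and decode the body.

    Assumption: the API returns JSON; this is not confirmed by the page.
    """
    url = quality_url(category, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("diffusion", "keonlee9420/DiffGAN-TTS")` sends the same request as the curl command and counts against the same daily request limit.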