keonlee9420/Comprehensive-Transformer-TTS

A non-autoregressive, Transformer-based text-to-speech system supporting a family of SOTA transformers with supervised and unsupervised duration modeling. The project grows with the research community, aiming to achieve the ultimate TTS.

44 / 100 (Emerging)

This project helps audio engineers, content creators, and AI researchers generate realistic human speech from written text. You provide text input, and it produces high-quality audio files in various voices and styles. It's designed for users who need to synthesize speech quickly and with fine-grained control over vocal characteristics.

328 stars. No commits in the last 6 months.

Use this if you need to generate high-quality, natural-sounding speech from text, especially if you require control over speaking rate, pitch, and volume, or want to experiment with different voice models.

Not ideal if you're looking for a simple, off-the-shelf text-to-speech solution without any technical setup or customization.

speech-synthesis audio-generation voice-over content-creation linguistics
Stale 6m · No Package · No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 18 / 25

Stars: 328
Forks: 43
Language: Python
License: MIT
Last pushed: Sep 24, 2022
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/keonlee9420/Comprehensive-Transformer-TTS"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
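
The same request can be made from Python. The sketch below is a minimal example using only the standard library. It assumes the endpoint returns JSON and that the URL path follows a category/owner/repo pattern, which is inferred from the single curl example above and is not documented here. The helper names `quality_url` and `fetch_quality` are hypothetical, chosen for illustration.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; everything after it is assumed
# to follow a <category>/<owner>/<repo> pattern.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    parts = [quote(p, safe="") for p in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """GET the endpoint and decode the body as JSON.

    The response format is an assumption; adjust the decoding if the
    service returns something else.
    """
    request = urllib.request.Request(
        quality_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "keonlee9420", "Comprehensive-Transformer-TTS")` issues the same request as the curl command. Since the unauthenticated limit is 100 requests/day, caching the result locally is worth considering for repeated use.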