keonlee9420/Comprehensive-Transformer-TTS
A non-autoregressive Transformer-based text-to-speech system, supporting a family of SOTA transformers with supervised and unsupervised duration modeling. This project grows with the research community, aiming to achieve the ultimate TTS.
This project helps audio engineers, content creators, and AI researchers generate realistic human speech from written text. You provide text input, and it produces high-quality audio files in various voices and styles. It's designed for users who need to synthesize speech quickly and with fine-grained control over vocal characteristics.
328 stars. No commits in the last 6 months.
Use this if you need to generate high-quality, natural-sounding speech from text, especially if you require control over speaking rate, pitch, and volume, or want to experiment with different voice models.
Not ideal if you're looking for a simple, off-the-shelf text-to-speech solution without any technical setup or customization.
Stars: 328
Forks: 43
Language: Python
License: MIT
Category:
Last pushed: Sep 24, 2022
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/keonlee9420/Comprehensive-Transformer-TTS"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
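The same request can be made from Python. The sketch below is a minimal example using only the standard library, not an official client. It assumes the URL path follows the category/owner/repo pattern seen in the curl command above and that the endpoint returns JSON; neither is confirmed by this page. The function names are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Assemble the endpoint URL.

    Assumption: the path is <category>/<owner>/<repo>, inferred from the
    single curl example on this page.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository.

    Assumption: the response body is JSON. No API key is sent, so the
    keyless daily limit applies.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling fetch_quality("voice-ai", "keonlee9420", "Comprehensive-Transformer-TTS") requests the same URL as the curl command. The shape of the returned data is not documented here, so inspect it before relying on any particular field.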
Higher-rated alternatives
yeyupiaoling/MASR
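A PyTorch implementation of a streaming and non-streaming automatic speech recognition framework, compatible with both online and offline recognition. It currently supports the Conformer, Squeezeformer, and DeepSpeech2 models, along with multiple data augmentation methods.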
shivammehta25/Matcha-TTS
[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching
coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
DigitalPhonetics/IMS-Toucan
Controllable and fast Text-to-Speech for over 7000 languages!
gabrielmittag/NISQA
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment