keonlee9420/Comprehensive-Tacotron2
PyTorch implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. This implementation supports both single-speaker and multi-speaker TTS, along with several techniques to improve the robustness and efficiency of the model.
This project helps you convert written text into natural-sounding speech, whether you need a single voice or multiple distinct voices. You provide written sentences or entire scripts, and it generates audio files of those words being spoken. This is ideal for content creators, educators, or anyone needing to generate high-quality spoken audio from text.
No commits in the last 6 months.
Use this if you need to transform text into human-like speech with options for single or multiple speaker voices.
Not ideal if you need to use specific, existing audio recordings to clone voices or train a voice model from very limited data.
Stars: 48
Forks: 15
Language: Python
License: MIT
Last pushed: Jul 31, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/keonlee9420/Comprehensive-Tacotron2"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
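The same request can be made from Python. What follows is only a minimal sketch using the standard library: it builds the URL from the pattern visible in the curl command and assumes the endpoint returns JSON, which this page does not confirm. The helper names `quality_url` and `fetch_quality` are hypothetical, as is the reading of `voice-ai` as a category segment of the path.

```python
import json
import urllib.request

# Base path copied from the curl example; everything after it is assumed
# to follow the pattern <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL following the pattern in the curl command."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository.

    Assumption (unverified): the response body is UTF-8 encoded JSON.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "keonlee9420", "Comprehensive-Tacotron2")` would issue the same request as the curl command. Without a key, such calls count against the 100 requests/day limit.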
Higher-rated alternatives
yeyupiaoling/MASR
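A PyTorch implementation of a streaming and non-streaming automatic speech recognition framework, compatible with both online and offline recognition. It currently supports the Conformer, Squeezeformer, and DeepSpeech2 models, as well as multiple data augmentation methods.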
shivammehta25/Matcha-TTS
[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching
coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
DigitalPhonetics/IMS-Toucan
Controllable and fast Text-to-Speech for over 7000 languages!
gabrielmittag/NISQA
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment