rishikksh20/TalkNet2-pytorch
TalkNet 2: Non-Autoregressive Depth-Wise Separable Convolutional Model for Speech Synthesis with Explicit Pitch and Duration Prediction.
This work-in-progress project generates natural-sounding speech from text. It takes written text as input and produces speech audio, with explicit control over pitch and duration (speech timing). It is aimed at developers who need to integrate text-to-speech into an application.
No commits in the last 6 months.
Use this if you are a developer looking for an advanced, non-autoregressive speech synthesis model to build into your voice applications, once the project is complete.
Not ideal if you need a ready-to-use application for text-to-speech conversion or if you are not a developer, as this requires significant technical setup and is still under development.
Stars: 89
Forks: 6
Language: Python
License: MIT
Category:
Last pushed: May 27, 2021
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/rishikksh20/TalkNet2-pytorch"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
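The same request can be sketched in Python using only the standard library. This is a minimal sketch, not a documented client: the `category/owner/repo` path layout is inferred from the single curl URL above, the helper names `build_quality_url` and `fetch_quality` are hypothetical, and the JSON response body is an assumption, since the page does not describe the response format.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL.

    Assumption: the path is <category>/<owner>/<repo>, inferred from the
    one example URL shown on this page.
    """
    parts = [quote(p, safe="") for p in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send a GET request and decode the body.

    Assumption: the endpoint returns JSON; the schema is not documented
    here, so the decoded value is returned without further interpretation.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the URL; call fetch_quality(...) to hit the network.
    print(build_quality_url("voice-ai", "rishikksh20", "TalkNet2-pytorch"))
```

Calling `fetch_quality("voice-ai", "rishikksh20", "TalkNet2-pytorch")` performs the actual request and counts against the daily request limit mentioned above.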
Higher-rated alternatives
bshall/Tacotron
A PyTorch implementation of Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis
Kyubyong/dc_tts
A TensorFlow Implementation of DC-TTS: yet another text-to-speech model
DemisEom/SpecAugment
An implementation of SpecAugment with TensorFlow & PyTorch, introduced by Google Brain
Rayhane-mamah/Tacotron-2
DeepMind's Tacotron-2 Tensorflow implementation
Kyubyong/tacotron
A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model