enhuiz/vall-e
An unofficial PyTorch implementation of the audio LM VALL-E
This project helps you create custom text-to-speech voices for various applications. You provide audio recordings and corresponding text transcripts, and it generates speech from new text input in the voice learned from your data. This is ideal for voice actors, content creators, or businesses needing unique voiceovers.
2,992 stars. No commits in the last 6 months.
Use this if you want to train a text-to-speech model with a specific voice, rather than using a generic synthesized voice.
Not ideal if you need an out-of-the-box text-to-speech solution without custom voice training or if you lack access to GPU hardware.
Stars: 2,992
Forks: 406
Language: Python
License: MIT
Category:
Last pushed: May 10, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/enhuiz/vall-e"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
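The same request can be made from Python. The following is a minimal sketch using only the standard library, not an official client. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and the path pattern (category, owner, repo) is inferred from the single curl URL above. The response format is not documented on this page, so the sketch tries JSON and falls back to the raw text. It sends no API key, so the keyless daily limit applies.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example; the segment layout
# (category/owner/repo) is an assumption inferred from that one URL.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL, percent-encoding each path segment."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the quality record for one repository.

    Returns parsed JSON if the body is valid JSON, otherwise the raw
    text, because the response format is an assumption here.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read().decode("utf-8")
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        return body
```

Calling `fetch_quality("voice-ai", "enhuiz", "vall-e")` requests the same URL as the curl command above.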
Higher-rated alternatives
bshall/Tacotron
A PyTorch implementation of Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis
Kyubyong/dc_tts
A TensorFlow Implementation of DC-TTS: yet another text-to-speech model
DemisEom/SpecAugment
An Implementation of SpecAugment with TensorFlow & PyTorch, introduced by Google Brain
Rayhane-mamah/Tacotron-2
DeepMind's Tacotron-2 Tensorflow implementation
Kyubyong/tacotron
A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model