rishikksh20/iSTFT-Avocodo-pytorch
Ultrafast GAN based Vocoder for Text to Speech
This project converts mel-spectrograms (the acoustic features that a text-to-speech model typically predicts from text) into audible speech. It takes a mel-spectrogram as input and generates high-quality synthetic speech as output. This is useful for researchers and developers working on text-to-speech systems who need fast and efficient audio generation.
No commits in the last 6 months.
Use this if you are building a text-to-speech system and need a vocoder that is very fast both to train and to run at inference time.
Not ideal if your absolute highest priority is audio quality and you are not concerned with inference or training speed.
Stars: 50
Forks: 7
Language: Python
License: MIT
Category:
Last pushed: Jul 16, 2022
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/rishikksh20/iSTFT-Avocodo-pytorch"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
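For scripted access, the same request can be made from Python. The sketch below is a minimal illustration, not official client code: the helper names build_quality_url and fetch_quality are hypothetical, the category/owner/repo URL pattern is inferred only from the single curl example above, and the assumption that the endpoint returns JSON is not confirmed by this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; anything beyond it is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the request URL, assuming a category/owner/repo path layout."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record and parse it as JSON (the response format is assumed)."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Performs a real network request, so it only runs when executed directly.
    print(fetch_quality("voice-ai", "rishikksh20", "iSTFT-Avocodo-pytorch"))
```

Keeping URL construction separate from the network call makes it easy to check the request path without spending any of the daily request quota.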
Higher-rated alternatives
kan-bayashi/ParallelWaveGAN
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with Pytorch
fatchord/WaveRNN
WaveRNN Vocoder + TTS
shangeth/wavencoder
WavEncoder is a Python library for encoding audio signals, transforms for audio augmentation,...
rishikksh20/iSTFTNet-pytorch
iSTFTNet : Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier...
seungwonpark/melgan
MelGAN vocoder (compatible with NVIDIA/tacotron2)