rishikksh20/iSTFT-Avocodo-pytorch

Ultrafast GAN based Vocoder for Text to Speech

/ 100

Emerging

This project is a neural vocoder: it converts mel-spectrograms (the acoustic features that a text-to-speech model predicts from text) into audible speech. It takes a mel-spectrogram as input and generates high-quality synthetic speech as a waveform. This is useful for researchers and developers working on text-to-speech systems who need fast and efficient audio generation.
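
To make the input/output relationship concrete, here is a minimal sketch of how the size of a mel-spectrogram relates to the length of the audio a vocoder produces from it. The sample rate, hop length, and mel-band count are assumed values that are common in text-to-speech setups; they are not taken from this repository's configuration, and the function names are hypothetical.

```python
# Illustrative values only: common TTS settings, NOT read from this repository.
SAMPLE_RATE = 22050   # assumed audio sample rate in Hz
HOP_LENGTH = 256      # assumed number of audio samples covered by one mel frame
N_MELS = 80           # assumed number of mel bands per frame


def expected_audio_length(n_frames: int, hop_length: int = HOP_LENGTH) -> int:
    """Approximate number of waveform samples generated for n_frames mel frames."""
    return n_frames * hop_length


def duration_seconds(n_frames: int, hop_length: int = HOP_LENGTH,
                     sample_rate: int = SAMPLE_RATE) -> float:
    """Approximate duration in seconds of the audio generated from n_frames."""
    return expected_audio_length(n_frames, hop_length) / sample_rate


if __name__ == "__main__":
    # (mel bands, frames) for a hypothetical utterance
    mel_shape = (N_MELS, 400)
    samples = expected_audio_length(mel_shape[1])
    print(f"mel {mel_shape} -> {samples} samples, "
          f"{duration_seconds(mel_shape[1]):.2f} s")
```

Under these assumptions each mel frame expands into `HOP_LENGTH` audio samples, which is why the speed of the upsampling network matters so much for a vocoder.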

No commits in the last 6 months.

Use this if you are building a text-to-speech system and need a vocoder that is very fast in both training and inference.

Not ideal if audio quality is your absolute highest priority and training or inference speed is not a concern.

text-to-speech speech-synthesis audio-generation voice-AI

Stale 6m No Package No Dependents

Maintenance 0 / 25

Adoption 8 / 25

Maturity 16 / 25

Community 13 / 25

Language: Python

License: MIT

Higher-rated alternatives

kan-bayashi/ParallelWaveGAN

Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with Pytorch

fatchord/WaveRNN

WaveRNN Vocoder + TTS

shangeth/wavencoder

WavEncoder is a Python library for encoding audio signals, transforms for audio augmentation,...

rishikksh20/iSTFTNet-pytorch

iSTFTNet : Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier...

seungwonpark/melgan

MelGAN vocoder (compatible with NVIDIA/tacotron2)
