rishikksh20/UnivNet-pytorch
UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation
This project helps audio engineers and researchers generate realistic, natural-sounding speech from spectrograms. You provide a spectrogram, a time-frequency representation of an audio signal, and the vocoder outputs a high-fidelity audio waveform. It is aimed at professionals working on voice synthesis, text-to-speech systems, or audio manipulation.
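To make the term "spectrogram" concrete, here is a minimal sketch in plain Python that computes a linear-frequency magnitude spectrogram of a synthetic tone. It is an illustration only: the function name `magnitude_spectrogram`, the frame size, the hop, and the test tone are all assumptions chosen for this example, and the result is not the mel-scaled input format that this repository expects.

```python
import cmath
import math


def magnitude_spectrogram(samples, frame_size=64, hop=32):
    """Return a list of frames; each frame holds DFT magnitudes for bins 0..frame_size // 2.

    Hypothetical helper for illustration, not part of UnivNet-pytorch.
    """
    # Periodic Hann window to reduce leakage between frequency bins.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size)
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        mags = []
        for k in range(frame_size // 2 + 1):
            # Direct DFT of one windowed frame (slow, but dependency-free).
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# Assumed test signal: a 1000 Hz sine sampled at 8000 Hz.
sample_rate = 8000
tone_hz = 1000
samples = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(512)]

spec = magnitude_spectrogram(samples)
# Strongest frequency bin in the first frame, converted back to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
```

Each row of `spec` is one time frame and each column one frequency bin, so the tone shows up as a constant peak across frames. A vocoder performs the reverse mapping, from such a time-frequency grid back to waveform samples.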
No commits in the last 6 months.
Use this if you need to transform spectrograms into high-quality, lifelike audio, especially for speech synthesis applications.
Not ideal if you want a complete, end-to-end voice cloning or text-to-speech toolbox that does not require working with spectrograms directly.
Stars: 76
Forks: 9
Language: Python
License: MIT
Category:
Last pushed: Aug 30, 2021
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/rishikksh20/UnivNet-pytorch"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
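The same request can be sketched in Python using only the standard library. This is an illustration under assumptions: the path layout `category/owner/repo` is inferred from the single URL shown above, the helper names `quality_url` and `fetch_quality` are hypothetical, and the response is assumed to be JSON, which this page does not confirm.

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL (path layout inferred from one example, so an assumption)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch and decode the response, assuming the endpoint returns JSON."""
    with urllib.request.urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Building the URL needs no network access; call fetch_quality(...) to send the request.
url = quality_url("voice-ai", "rishikksh20", "UnivNet-pytorch")
```

With the keyless limit of 100 requests/day, it is worth caching results locally rather than calling `fetch_quality` repeatedly for the same repository.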
Higher-rated alternatives
kan-bayashi/ParallelWaveGAN
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with Pytorch
fatchord/WaveRNN
WaveRNN Vocoder + TTS
shangeth/wavencoder
WavEncoder is a Python library for encoding audio signals, transforms for audio augmentation,...
rishikksh20/iSTFTNet-pytorch
iSTFTNet : Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier...
seungwonpark/melgan
MelGAN vocoder (compatible with NVIDIA/tacotron2)