maum-ai/univnet

Unofficial PyTorch Implementation of UnivNet Vocoder (https://arxiv.org/abs/2106.07889)

Emerging

This project helps speech synthesis researchers and engineers generate high-fidelity, natural-sounding audio from mel-spectrograms: it takes a mel-spectrogram as input and outputs the corresponding audio waveform. It is intended for professionals building text-to-speech systems or other applications that require realistic synthetic speech.

282 stars. No commits in the last 6 months.

Use this if you need to convert mel-spectrograms into high-quality, natural-sounding speech waveforms efficiently.

Not ideal if you are looking for a complete text-to-speech system that handles text processing and acoustic modeling, as this only covers the vocoder component.

speech-synthesis text-to-speech audio-generation neural-vocoder synthetic-media

Stale 6m No Package No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 19 / 25

Stars

282

Forks

Language

Python

License

BSD-3-Clause

Higher-rated alternatives

yeyupiaoling/MASR

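A PyTorch implementation of a streaming and non-streaming automatic speech recognition framework, compatible with both online and offline recognition. It currently supports the Conformer, Squeezeformer, and DeepSpeech2 models, along with multiple data augmentation methods.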

shivammehta25/Matcha-TTS

[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching

coqui-ai/TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

DigitalPhonetics/IMS-Toucan

Controllable and fast Text-to-Speech for over 7000 languages!

gabrielmittag/NISQA

NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
