azraelkuan/FFTNet

FFTNet: a Real-Time Speaker-Dependent Neural Vocoder


Emerging

This project synthesizes realistic, human-like speech in the voice of a single speaker. It takes processed audio features (such as pitch and volume) derived from that speaker's recordings and generates high-quality speech in real time, in that speaker's own style. Voice-over artists, content creators, and anyone who needs custom spoken audio in a specific voice could use it.

No commits in the last 6 months.

Use this if you need to create new speech utterances in a specific person's voice, particularly for applications requiring real-time audio generation.

Not ideal if you need to synthesize speech from text without a pre-existing audio training set from a specific speaker, or if you require a multi-speaker system.

speech-synthesis voice-generation audio-production content-creation voice-cloning

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 8 / 25

Maturity 8 / 25

Community 15 / 25


Language Python

License None

Higher-rated alternatives

kan-bayashi/ParallelWaveGAN

Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with Pytorch

fatchord/WaveRNN

WaveRNN Vocoder + TTS

shangeth/wavencoder

WavEncoder is a Python library for encoding audio signals, transforms for audio augmentation,...

rishikksh20/iSTFTNet-pytorch

iSTFTNet: Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier...

seungwonpark/melgan

MelGAN vocoder (compatible with NVIDIA/tacotron2)
