erogol/FFTNet
FFTNet vocoder implementation
This tool is a neural vocoder: it takes acoustic features extracted from audio recordings and converts them into new, realistic-sounding speech waveforms. Researchers and developers working on speech synthesis use it to generate human-like voices efficiently from audio features.
No commits in the last 6 months.
Use this if you need to transform the acoustic properties of audio, like a spectrogram, into high-fidelity speech waveforms for synthetic voice generation.
Not ideal if you're looking for a tool to transcribe speech to text, translate languages, or perform general audio editing.
Stars: 81
Forks: 8
Language: Jupyter Notebook
License: MPL-2.0
Category:
Last pushed: Sep 28, 2018
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/erogol/FFTNet"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
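The same request can be made from a script. The following is a minimal Python sketch, not an official client: the helper names build_quality_url and fetch_quality are hypothetical, the category/owner/repo path layout is inferred from the single curl URL above, and the response is assumed to be JSON, which this page does not confirm.

```python
import json
import urllib.request
from urllib.parse import quote

# Base URL copied from the curl example; everything after it is assumed
# to follow the pattern <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    # Percent-encode each path segment so unusual names cannot break the URL.
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch one record over HTTP.

    Assumption: the body is JSON. If it is not, json.loads raises
    json.JSONDecodeError and the caller should inspect the raw response.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a real network request, so it is left commented out):
# record = fetch_quality("voice-ai", "erogol", "FFTNet")
# print(record)
```

Keyless access is capped at 100 requests/day, so cache the result rather than calling fetch_quality in a loop.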
Higher-rated alternatives
descriptinc/descript-audio-codec
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz...
drethage/speech-denoising-wavenet
A neural network for end-to-end speech denoising
YuanGongND/ast
Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".
iver56/torch-audiomentations
Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.
lmnt-com/wavegrad
A fast, high-quality neural vocoder.