zzw922cn/LPC_for_TTS
Linear prediction coefficient (LPC) estimation from mel-spectrograms, implemented in Python using the Levinson-Durbin algorithm.
This tool helps speech synthesis researchers and engineers analyze and process audio for text-to-speech (TTS) systems. It takes raw audio files, converts them into mel-spectrograms, and then estimates linear prediction coefficients (LPC) from those spectrograms using the Levinson-Durbin algorithm. The coefficients can serve as features for LPC-based neural vocoders such as LPCNet.
No commits in the last 6 months.
Use this if you need to derive Linear Prediction Coefficients from audio data, specifically through a mel-spectrogram intermediate step, for speech synthesis or analysis.
Not ideal if you are looking for a complete text-to-speech synthesis solution or a tool for general audio analysis unrelated to speech modeling.
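The core step described above, turning a spectral frame into LPC via Levinson-Durbin, can be sketched as follows. This is a minimal illustration and not the repository's own code: the function names `autocorr_from_power_spectrum` and `levinson_durbin` are hypothetical, and the sketch assumes the mel-spectrogram frame has already been mapped back to a linear-frequency power spectrum (for example with a pseudo-inverse of the mel filterbank). It relies on the fact that the inverse FFT of a power spectrum gives the autocorrelation sequence.

```python
import numpy as np


def autocorr_from_power_spectrum(power_spec, n_fft, order):
    """Autocorrelation lags 0..order from a one-sided power spectrum.

    power_spec has n_fft // 2 + 1 bins (assumed to be linear-frequency,
    i.e. already converted back from the mel scale).
    """
    r = np.fft.irfft(power_spec, n=n_fft)
    return r[: order + 1]


def levinson_durbin(r, order):
    """Solve the Yule-Walker equations for the given autocorrelation.

    Returns (a, err) where a[0] == 1 and the prediction filter is
    A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order, and err is the
    final prediction error power.
    """
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        # Correlation of the current predictor with the next lag.
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err  # reflection coefficient
        a_prev = a.copy()
        a[1:i] = a_prev[1:i] + k * a_prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


# Illustration on a synthetic frame: the power spectrum of a known
# all-pole filter 1 / A(z), so the estimate can be compared to a_true.
n_fft = 512
a_true = np.array([1.0, -1.2, 0.6])
power_spec = 1.0 / np.abs(np.fft.rfft(a_true, n_fft)) ** 2
r = autocorr_from_power_spectrum(power_spec, n_fft, order=2)
a_est, err = levinson_durbin(r, order=2)
```

In a real pipeline the same two functions would be applied per frame, and the quality of the result depends on how well the mel-to-linear approximation preserves the spectral envelope.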
Stars: 71
Forks: 11
Language: Python
License: —
Category:
Last pushed: Mar 19, 2021
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/zzw922cn/LPC_for_TTS"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
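The same request can be made from Python with only the standard library. This is a rough sketch under two assumptions that the listing does not confirm: that the URL follows a `category/owner/repo` pattern (inferred from the single example above), and that the endpoint returns JSON. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path taken from the curl example; the path pattern is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL for one repository."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the record for one repository (assumes a JSON response body)."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Each call counts against the daily request limit.
    print(fetch_quality("voice-ai", "zzw922cn", "LPC_for_TTS"))
```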
Higher-rated alternatives
kan-bayashi/ParallelWaveGAN
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with Pytorch
fatchord/WaveRNN
WaveRNN Vocoder + TTS
shangeth/wavencoder
WavEncoder is a Python library for encoding audio signals, transforms for audio augmentation,...
rishikksh20/iSTFTNet-pytorch
iSTFTNet : Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier...
seungwonpark/melgan
MelGAN vocoder (compatible with NVIDIA/tacotron2)