cvqluu/TDNN
Time delay neural network (TDNN) implementation in PyTorch using the unfold method
This is a building block for machine learning engineers who work on speech processing and other audio and voice technologies. It helps them build neural networks that analyze sequences of sound features, such as Mel-frequency cepstral coefficients (MFCCs). Given a feature sequence as input, it produces an output sequence that captures temporal dependencies, suitable for tasks like speaker recognition or speech command understanding.
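To make the "unfold" idea concrete, here is a minimal sketch of a single TDNN layer, assuming PyTorch is installed. It is not the repository's code: the class name `TDNNLayer`, its parameters, and the ReLU activation are illustrative assumptions. The sketch uses `torch.nn.functional.unfold` to stack each frame's temporal context into one vector, then applies a shared linear map to every stacked window.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TDNNLayer(nn.Module):
    """Hypothetical TDNN layer; names and defaults are illustrative only."""

    def __init__(self, input_dim, output_dim, context_size=5, dilation=1):
        super().__init__()
        self.input_dim = input_dim
        self.context_size = context_size
        self.dilation = dilation
        # One affine map shared by every stacked context window.
        self.linear = nn.Linear(input_dim * context_size, output_dim)

    def forward(self, x):
        # x: (batch, frames, input_dim), e.g. a batch of MFCC sequences.
        x = x.unsqueeze(1)  # (batch, 1, frames, input_dim)
        # Each window spans context_size frames (spaced by dilation)
        # and the full feature width, so it slides along time only.
        windows = F.unfold(
            x,
            kernel_size=(self.context_size, self.input_dim),
            dilation=(self.dilation, 1),
        )  # (batch, context_size * input_dim, out_frames)
        windows = windows.transpose(1, 2)  # (batch, out_frames, stacked_dim)
        return torch.relu(self.linear(windows))


# Example: 2 utterances, 20 frames each, 23 features per frame.
features = torch.randn(2, 20, 23)
layer = TDNNLayer(input_dim=23, output_dim=64, context_size=5, dilation=2)
output = layer(features)
```

Because no padding is applied in this sketch, the output is shorter than the input: with 20 frames, a context of 5, and a dilation of 2, the layer yields 20 - 2 * (5 - 1) = 12 output frames.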
204 stars. No commits in the last 6 months.
Use this if you are a machine learning engineer developing speech recognition or speaker verification systems and need to implement time delay neural networks.
Not ideal if you are not a developer and want an off-the-shelf audio analysis solution rather than a building block for neural networks.
Stars: 204
Forks: 40
Language: Python
License: —
Category:
Last pushed: Nov 21, 2019
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/cvqluu/TDNN"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
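The same request can be made from Python using only the standard library. This is a rough sketch under two assumptions the page does not confirm: that the three path segments after `/quality/` are a category, a repository owner, and a repository name, and that the endpoint returns JSON. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL (the segment meanings are an assumption)."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch and decode the response, assuming the body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Reproduces the URL used in the curl command above.
url = quality_url("voice-ai", "cvqluu", "TDNN")
```

Calling `fetch_quality("voice-ai", "cvqluu", "TDNN")` performs the network request. Since the anonymous tier is limited to 100 requests/day, caching the result locally is a sensible precaution.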
Higher-rated alternatives
kan-bayashi/ParallelWaveGAN
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with Pytorch
fatchord/WaveRNN
WaveRNN Vocoder + TTS
shangeth/wavencoder
WavEncoder is a Python library for encoding audio signals, transforms for audio augmentation,...
rishikksh20/iSTFTNet-pytorch
iSTFTNet : Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier...
seungwonpark/melgan
MelGAN vocoder (compatible with NVIDIA/tacotron2)