cvqluu/TDNN

Time delay neural network (TDNN) implementation in PyTorch using the unfold method


Emerging

This is a PyTorch building block for machine learning engineers working on speech processing. It helps them build neural networks that analyze sequences of acoustic features, such as Mel-frequency cepstral coefficients (MFCCs). Given a feature sequence as input, it produces an output sequence in which each frame draws on a window of neighboring input frames, which suits tasks such as speaker recognition or spoken command understanding.
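To make the input-to-output relationship concrete, here is a minimal sketch of the context-window splicing step that a TDNN layer applies before its learned linear transform. It is written in plain Python so it runs without dependencies, and it is not the repository's code: the function name `splice_frames` and the parameters `context_size` and `dilation` are hypothetical names chosen for illustration, and the sketch assumes no padding at the sequence edges.

```python
def splice_frames(frames, context_size=3, dilation=1):
    """Concatenate each frame with its dilated right-hand neighbours.

    `frames` is a list of equal-length feature vectors (one per time step).
    Illustrative helper only; names and defaults are assumptions.
    """
    span = dilation * (context_size - 1)   # extra frames covered by one window
    n_out = len(frames) - span             # output length without padding
    if n_out <= 0:
        raise ValueError("sequence is shorter than the context window")

    spliced = []
    for t in range(n_out):
        window = []
        for k in range(context_size):
            # Take every `dilation`-th frame starting at time step t
            window.extend(frames[t + k * dilation])
        spliced.append(window)
    return spliced


# Toy input: 10 frames of 4-dimensional features (stand-ins for MFCC vectors)
frames = [[float(t)] * 4 for t in range(10)]
spliced = splice_frames(frames, context_size=3, dilation=2)
print(len(spliced), len(spliced[0]))  # 6 12
```

With 10 input frames, a context of 3 and a dilation of 2, each window covers 5 time steps, so the output has 10 - 2 * (3 - 1) = 6 frames, each of dimension 3 * 4 = 12. A real TDNN layer would then apply a shared linear layer and a nonlinearity to every spliced frame.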

204 stars. No commits in the last 6 months.

Use this if you are a machine learning engineer developing speech recognition or speaker verification systems and need to implement time delay neural networks.

Not ideal if you are not a developer and want an off-the-shelf audio analysis solution rather than a building block for neural networks.

speech-processing speaker-recognition voice-technology audio-analysis neural-network-development

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 8 / 25

Community 21 / 25


Stars

204

Forks

Language

Python

License

—

Higher-rated alternatives

kan-bayashi/ParallelWaveGAN

Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with Pytorch

fatchord/WaveRNN

WaveRNN Vocoder + TTS

shangeth/wavencoder

WavEncoder is a Python library for encoding audio signals, transforms for audio augmentation,...

rishikksh20/iSTFTNet-pytorch

iSTFTNet : Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier...

seungwonpark/melgan

MelGAN vocoder (compatible with NVIDIA/tacotron2)
