maetshju/flux-blstm-implementation
An implementation of the Graves & Schmidhuber (2005) bidirectional LSTM in Flux.
This project is for speech recognition researchers and engineers who need to classify phonemes in audio recordings. It takes raw speech audio as input and outputs a phoneme classification for each frame.
No commits in the last 6 months.
Use this if you are developing a speech recognition system and need an established, high-performance method for classifying individual speech sounds within audio frames.
Not ideal if you are looking for a general-purpose speech-to-text solution or a simpler machine learning model for other kinds of sequential data.
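To make "bidirectional" and "framewise" concrete, here is a minimal sketch in plain Python. It is not the repository's Julia/Flux code. It assumes each audio frame has already been turned into a feature vector, uses a standard LSTM cell without the peephole connections found in some LSTM variants, and runs with random, untrained weights. All names (`LSTMCell`, `run`, `blstm_framewise`) are hypothetical and chosen for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector (list of floats).
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def init_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class LSTMCell:
    """Standard LSTM cell (no peepholes); gates i, f, o and candidate g."""

    def __init__(self, n_in, n_hidden, rng):
        self.n_hidden = n_hidden
        # One weight matrix per gate, applied to the concatenation [x; h_prev].
        self.W = {k: init_matrix(n_hidden, n_in + n_hidden, rng) for k in "ifog"}
        self.b = {k: [0.0] * n_hidden for k in "ifog"}

    def step(self, x, h, c):
        z = x + h  # list concatenation
        pre = {k: [a + b for a, b in zip(matvec(self.W[k], z), self.b[k])]
               for k in "ifog"}
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        g = [math.tanh(v) for v in pre["g"]]
        c_new = [fi * ci + ii * gi for fi, ci, ii, gi in zip(f, c, i, g)]
        h_new = [oi * math.tanh(ci) for oi, ci in zip(o, c_new)]
        return h_new, c_new


def run(cell, frames):
    # Run the cell over the frames in the given order; return one state per frame.
    h = [0.0] * cell.n_hidden
    c = [0.0] * cell.n_hidden
    states = []
    for x in frames:
        h, c = cell.step(x, h, c)
        states.append(h)
    return states


def softmax(v):
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]


def blstm_framewise(frames, fwd, bwd, W_out, b_out):
    h_f = run(fwd, frames)              # left-to-right context
    h_b = run(bwd, frames[::-1])[::-1]  # right-to-left context, re-aligned to frame order
    probs = []
    for hf, hb in zip(h_f, h_b):
        logits = [a + b for a, b in zip(matvec(W_out, hf + hb), b_out)]
        probs.append(softmax(logits))   # one class distribution per frame
    return probs


# Toy run with made-up sizes and random inputs.
rng = random.Random(0)
n_features, n_hidden, n_classes, n_frames = 3, 4, 5, 6
fwd = LSTMCell(n_features, n_hidden, rng)
bwd = LSTMCell(n_features, n_hidden, rng)
W_out = init_matrix(n_classes, 2 * n_hidden, rng)
b_out = [0.0] * n_classes
frames = [[rng.uniform(-1, 1) for _ in range(n_features)] for _ in range(n_frames)]
probs = blstm_framewise(frames, fwd, bwd, W_out, b_out)
labels = [max(range(n_classes), key=p.__getitem__) for p in probs]
```

Each frame's prediction draws on both earlier frames (via `h_f`) and later frames (via `h_b`), which is the point of the bidirectional design. A real system would train the weights and map each class index to a phoneme label.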
Stars: 11
Forks: 3
Language: Julia
License: MIT
Category:
Last pushed: May 14, 2018
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/maetshju/flux-blstm-implementation"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
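The same request can be made from a script. The sketch below uses only the Python standard library. It assumes the path follows a category/owner/repo pattern, inferred from the single URL above, and that the response body is JSON, which this page does not state. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    # Assumed pattern: <base>/<category>/<owner>/<repo>, percent-encoding each part.
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10):
    # Assumption: the endpoint returns a JSON document.
    # Raises urllib.error.HTTPError on a non-2xx status.
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


url = quality_url("voice-ai", "maetshju", "flux-blstm-implementation")
```

Calling `fetch_quality("voice-ai", "maetshju", "flux-blstm-implementation")` performs the actual request. With the keyless limit of 100 requests per day, caching the result locally is a sensible default.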
Higher-rated alternatives
kan-bayashi/ParallelWaveGAN
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with Pytorch
fatchord/WaveRNN
WaveRNN Vocoder + TTS
shangeth/wavencoder
WavEncoder is a Python library for encoding audio signals, transforms for audio augmentation,...
rishikksh20/iSTFTNet-pytorch
iSTFTNet : Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier...
seungwonpark/melgan
MelGAN vocoder (compatible with NVIDIA/tacotron2)