maetshju/flux-blstm-implementation

An implementation of the Graves & Schmidhuber (2005) bidirectional LSTM in Flux.

Emerging

This project is aimed at speech recognition researchers and engineers who need to classify phonemes in audio recordings. It takes raw speech audio as input and outputs a phoneme classification for each frame.

No commits in the last 6 months.

Use this if you are developing a speech recognition system and need a bidirectional LSTM, following Graves & Schmidhuber (2005), to classify the speech sound in each audio frame.

Not ideal if you need a general-purpose speech-to-text system or a simpler machine learning model for other kinds of sequential data.

speech-recognition phoneme-classification audio-processing neural-networks linguistics

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 5 / 25

Maturity 16 / 25

Community 14 / 25

Language: Julia

License: MIT

Higher-rated alternatives

kan-bayashi/ParallelWaveGAN

Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with Pytorch

fatchord/WaveRNN

WaveRNN Vocoder + TTS

shangeth/wavencoder

WavEncoder is a Python library for encoding audio signals, transforms for audio augmentation,...

rishikksh20/iSTFTNet-pytorch

iSTFTNet : Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier...

seungwonpark/melgan

MelGAN vocoder (compatible with NVIDIA/tacotron2)
