hirofumi0810/tensorflow_end2end_speech_recognition
End-to-End speech recognition implementation base on TensorFlow (CTC, Attention, and MTL training)
This project helps researchers and developers build custom speech recognition systems. It takes audio recordings from popular speech datasets like TIMIT, LibriSpeech, or CSJ, and processes them to output text transcripts. It's designed for someone specializing in machine learning or natural language processing who needs to experiment with advanced end-to-end speech recognition models.
314 stars. No commits in the last 6 months.
Use this if you are developing or researching new speech-to-text models and need a robust, customizable framework.
Not ideal if you are a general user looking for an out-of-the-box speech recognition application or API.
Stars
314
Forks
119
Language
Python
License
MIT
Category
Last pushed
Jan 23, 2018
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/hirofumi0810/tensorflow_end2end_speech_recognition"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Compare
Related tools
githubharald/CTCDecoder
Connectionist Temporal Classification (CTC) decoding algorithms: best path, beam search, lexicon...
githubharald/CTCWordBeamSearch
Connectionist Temporal Classification (CTC) decoder with dictionary and language model.
nl8590687/ASRT_SpeechRecognition
A Deep-Learning-Based Chinese Speech Recognition System 基于深度学习的中文语音识别系统
athena-team/athena
an open-source implementation of sequence-to-sequence based speech processing engine
robmsmt/KerasDeepSpeech
A Keras CTC implementation of Baidu's DeepSpeech for model experimentation