MorrisXu-Driving/Improving_DeepSpeech_2_by_RNN_Transducer_Pytorch_Implementation

This repository compares two loss functions, CTC and RNN-T, on a model based on Deep Speech 2.


Experimental

This project helps machine learning engineers and researchers improve Automatic Speech Recognition (ASR) model performance. It compares the DeepSpeech 2 model with an enhanced version that uses an RNN-Transducer (RNN-T); both take audio features as input and output transcribed text. It is aimed at those building or optimizing speech-to-text systems.
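To make the difference between the two losses concrete, here is a minimal pure-Python sketch of the forward recursions behind each one. It is not taken from the repository, which presumably relies on PyTorch implementations; the names `ctc_nll`, `rnnt_nll`, and the blank index `BLANK = 0` are illustrative assumptions. CTC scores one distribution per audio frame, while RNN-T scores a distribution for every (frame, emitted-label-count) pair, so its output can depend on the labels already produced.

```python
import math

NEG_INF = float("-inf")
BLANK = 0  # assumption: index 0 is the blank symbol


def logsumexp(*xs):
    """Numerically stable log(sum(exp(x))) over the given values."""
    m = max(xs)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(x - m) for x in xs))


def ctc_nll(log_probs, target):
    """CTC negative log-likelihood.

    log_probs[t][k]: log-probability of symbol k at frame t (T x V).
    target: list of non-blank label indices.
    """
    # Interleave blanks: [blank, y1, blank, y2, ..., blank]
    ext = [BLANK]
    for y in target:
        ext += [y, BLANK]
    T, S = len(log_probs), len(ext)
    alpha = [[NEG_INF] * S for _ in range(T)]
    alpha[0][0] = log_probs[0][ext[0]]
    if S > 1:
        alpha[0][1] = log_probs[0][ext[1]]
    for t in range(1, T):
        for s in range(S):
            terms = [alpha[t - 1][s]]  # stay on the same symbol
            if s >= 1:
                terms.append(alpha[t - 1][s - 1])  # advance by one
            # Skip over a blank only between two different labels
            if s >= 2 and ext[s] != BLANK and ext[s] != ext[s - 2]:
                terms.append(alpha[t - 1][s - 2])
            alpha[t][s] = logsumexp(*terms) + log_probs[t][ext[s]]
    if S == 1:
        return -alpha[T - 1][0]
    return -logsumexp(alpha[T - 1][S - 1], alpha[T - 1][S - 2])


def rnnt_nll(log_probs, target):
    """RNN-T negative log-likelihood.

    log_probs[t][u][k]: log-probability of symbol k at frame t after
    u labels have been emitted (T x (U+1) x V).
    """
    T, U = len(log_probs), len(target)
    alpha = [[NEG_INF] * (U + 1) for _ in range(T)]
    alpha[0][0] = 0.0
    for t in range(T):
        for u in range(U + 1):
            if t == 0 and u == 0:
                continue
            terms = []
            if t > 0:  # blank: move to the next frame
                terms.append(alpha[t - 1][u] + log_probs[t - 1][u][BLANK])
            if u > 0:  # label: emit target[u-1] without consuming a frame
                terms.append(alpha[t][u - 1] + log_probs[t][u - 1][target[u - 1]])
            alpha[t][u] = logsumexp(*terms)
    # A final blank at the last frame terminates the sequence
    return -(alpha[T - 1][U] + log_probs[T - 1][U][BLANK])


# Toy check with uniform distributions over 3 symbols, 3 frames, target [1]
u = math.log(1 / 3)
ctc_inputs = [[u] * 3 for _ in range(3)]
rnnt_inputs = [[[u] * 3 for _ in range(2)] for _ in range(3)]
print(ctc_nll(ctc_inputs, [1]), rnnt_nll(rnnt_inputs, [1]))
```

With uniform inputs the two values differ only because the alignment lattices differ: CTC sums over frame-level paths that collapse to the target, whereas RNN-T sums over paths through the frame-by-label grid.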

No commits in the last 6 months.

Use this if you are a machine learning engineer or researcher specifically working on optimizing DeepSpeech 2 for better word error rates in speech recognition tasks.
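For readers unfamiliar with the metric, word error rate can be sketched as the word-level edit distance between the reference and the hypothesis, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are illustrative assumptions, not the repository's evaluation code.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Levenshtein distance over words, keeping only the previous row
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 words
print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.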

Not ideal if you are a non-technical end-user looking for a ready-to-use speech-to-text application, as this project focuses on model comparison and improvement for developers.

Automatic Speech Recognition, ASR model optimization, Deep Learning for Audio, RNN-Transducer, Speech-to-Text Development

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 4 / 25

Maturity 8 / 25

Community 8 / 25


Language: Python

License: none listed

Higher-rated alternatives

TensorSpeech/TensorFlowASR

TensorFlowASR: Almost State-of-the-art Automatic Speech Recognition in Tensorflow 2....

dangvansam/viet-asr

VietASR - Vietnamese Automatic Speech Recognition

wenet-e2e/wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

xinjli/allosaurus

Allosaurus is a pretrained universal phone recognizer for more than 2000 languages

srvk/eesen

The official repository of the Eesen project
