MorrisXu-Driving/Improving_DeepSpeech_2_by_RNN_Transducer_Pytorch_Implementation

This repository compares two loss functions, CTC and RNN-T, on a model based on Deep Speech 2.

20 / 100
Experimental

This project helps machine learning engineers and researchers improve Automatic Speech Recognition (ASR) model performance. It compares the DeepSpeech 2 model against an enhanced version that uses an RNN-Transducer; both take audio features as input and output transcribed text. It is aimed at those building or optimizing speech-to-text systems.

No commits in the last 6 months.

Use this if you are a machine learning engineer or researcher working specifically on lowering the word error rate of DeepSpeech 2 in speech recognition tasks.

Not ideal if you are a non-technical end-user looking for a ready-to-use speech-to-text application, as this project focuses on model comparison and improvement for developers.

Automatic Speech Recognition · ASR model optimization · Deep Learning for Audio · RNN-Transducer · Speech-to-Text Development
No License · Stale 6m · No Package · No Dependents
Maintenance 0 / 25
Adoption 4 / 25
Maturity 8 / 25
Community 8 / 25


Stars: 8
Forks: 1
Language: Python
License: None
Last pushed: May 24, 2021
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/MorrisXu-Driving/Improving_DeepSpeech_2_by_RNN_Transducer_Pytorch_Implementation"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
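The same request can be sketched in Python using only the standard library. This is a minimal illustration, not the service's official client: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the split of the path into category, owner, and repository is inferred from the single URL shown above, and the assumption that the endpoint returns JSON is not confirmed by this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Assemble the endpoint URL in the same shape as the curl example.

    The category/owner/repo layout is an assumption inferred from that
    one URL; it is not documented on this page.
    """
    parts = (category, owner, repo)
    return BASE_URL + "/" + "/".join(quote(p, safe="") for p in parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record and decode it.

    Assumes the response body is JSON, which this page does not state.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "MorrisXu-Driving", "Improving_DeepSpeech_2_by_RNN_Transducer_Pytorch_Implementation")` issues the same GET request as the curl command. Because the keyless tier is limited to 100 requests per day, it is worth caching the result rather than re-fetching on every run.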