SpringerNLP/Chapter12

Chapter 12: End-to-end Speech Recognition


Experimental

This project turns spoken audio into written text using an end-to-end deep learning model. You provide audio recordings, and it returns the corresponding transcriptions. It is designed for researchers, students, and practitioners who need to convert large volumes of speech into text for analysis or further processing.

No commits in the last 6 months.

Use this if you are exploring or implementing advanced speech-to-text conversion and need a robust, pre-configured environment with a well-known architecture.

Not ideal if you need an out-of-the-box API for general speech transcription, or if you lack access to GPU hardware or Docker.

speech-recognition, audio-transcription, natural-language-processing, AI-research

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 5 / 25

Maturity 8 / 25

Community 15 / 25


Language: Jupyter Notebook

License: none

Higher-rated alternatives

TensorSpeech/TensorFlowASR

TensorFlowASR: Almost State-of-the-art Automatic Speech Recognition in Tensorflow 2....

dangvansam/viet-asr

VietASR - Vietnamese Automatic Speech Recognition

wenet-e2e/wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

xinjli/allosaurus

Allosaurus is a pretrained universal phone recognizer for more than 2000 languages

srvk/eesen

The official repository of the Eesen project
