Kirili4ik/QuartzNet-ASR-pytorch

Automatic Speech Recognition (ASR) model QuartzNet trained on English CommonVoice. Implemented in PyTorch with CTC loss and beam search.

Experimental

This project provides a pre-trained QuartzNet Automatic Speech Recognition (ASR) model for converting spoken English into text. You supply English audio recordings and it returns the transcribed text. It is best suited to audio that resembles its training data, the English portion of CommonVoice.
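As a rough illustration of how a CTC-trained model turns per-frame outputs into text, the sketch below shows greedy CTC decoding: take the best symbol for each frame, merge consecutive repeats, then drop the blank symbol. This is not code from the repository. The project is described as using beam search, and greedy decoding is swapped in here as a simpler stand-in. The alphabet `VOCAB`, the blank index, and the helper `one_hot` are hypothetical and chosen only for the demo.

```python
from itertools import groupby
from typing import List, Sequence

BLANK = 0  # assumption: index 0 is the CTC blank symbol
VOCAB = "_ abcdefghijklmnopqrstuvwxyz'"  # hypothetical alphabet; VOCAB[0] stands in for blank


def greedy_ctc_decode(frame_scores: Sequence[Sequence[float]]) -> str:
    """Pick the best symbol per frame, merge consecutive repeats, then drop blanks."""
    best_path = [max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores]
    collapsed = [idx for idx, _ in groupby(best_path)]
    return "".join(VOCAB[idx] for idx in collapsed if idx != BLANK)


def one_hot(idx: int, size: int = len(VOCAB)) -> List[float]:
    """Demo helper: a frame that puts all of its score on a single symbol."""
    return [1.0 if i == idx else 0.0 for i in range(size)]


# The frame-level path "c c _ a a _ t" collapses to "cat".
demo_path = [VOCAB.index(ch) for ch in "cc_aa_t"]
demo_scores = [one_hot(i) for i in demo_path]
print(greedy_ctc_decode(demo_scores))  # prints "cat"
```

A blank between two identical symbols is what lets CTC emit a doubled letter: the path "hel_lo" decodes to "hello", while "hello" without the blank would collapse to "helo".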

No commits in the last 6 months.

Use this if you need to transcribe English audio recordings, particularly from datasets similar to Mozilla Common Voice, into text.

Not ideal if you need a model trained on a language other than English or require transcription of highly specialized or noisy audio without further fine-tuning.

audio-transcription speech-to-text voice-data-processing content-analysis research-transcription

No License Stale 6m No Package No Dependents

Maintenance 0 / 25

Adoption 6 / 25

Maturity 8 / 25

Community 13 / 25

Language: Jupyter Notebook

License: none

Higher-rated alternatives

TensorSpeech/TensorFlowASR

TensorFlowASR: Almost State-of-the-art Automatic Speech Recognition in Tensorflow 2....

dangvansam/viet-asr

VietASR - Vietnamese Automatic Speech Recognition

wenet-e2e/wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

xinjli/allosaurus

Allosaurus is a pretrained universal phone recognizer for more than 2000 languages

srvk/eesen

The official repository of the Eesen project