Kirili4ik/QuartzNet-ASR-pytorch

QuartzNet, an Automatic Speech Recognition (ASR) model trained on English CommonVoice. Implemented in PyTorch with CTC loss and beam search.

Score: 27 / 100 (Experimental)

This project offers a pre-trained QuartzNet Automatic Speech Recognition (ASR) model for converting spoken English into text. You provide audio recordings in English, and it outputs the transcribed text. It suits anyone who needs to transcribe speech from datasets like CommonVoice.
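The description mentions CTC loss and beam search. As a rough illustration of how CTC output becomes text, here is a minimal sketch of greedy CTC decoding, a simpler stand-in for beam search and not the repository's own decoder. The alphabet, the blank index, and the frame labels are all made up for the example.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol


def ctc_greedy_decode(frame_ids, alphabet, blank=BLANK):
    """Collapse runs of repeated frame labels, then drop blanks."""
    collapsed = [label for label, _ in groupby(frame_ids)]
    return "".join(alphabet[i] for i in collapsed if i != blank)


# Hypothetical alphabet: index 0 is the blank, the rest are characters.
ALPHABET = ["_", "a", "c", "t", " "]

# Per-frame argmax labels for a short utterance (made-up values).
frames = [0, 2, 2, 0, 1, 1, 1, 0, 3, 3, 0]
print(ctc_greedy_decode(frames, ALPHABET))  # prints "cat"
```

A blank between two identical labels keeps them distinct, so `[1, 0, 1]` decodes to `"aa"` rather than `"a"`. Beam search keeps several candidate prefixes per frame instead of a single argmax path.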

No commits in the last 6 months.

Use this if you need to transcribe English audio recordings, particularly from datasets similar to Mozilla Common Voice, into text.

Not ideal if you need a model trained on a language other than English or require transcription of highly specialized or noisy audio without further fine-tuning.

audio-transcription speech-to-text voice-data-processing content-analysis research-transcription
No License, Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 6 / 25
Maturity 8 / 25
Community 13 / 25

Stars: 16
Forks: 3
Language: Jupyter Notebook
License: none
Last pushed: Nov 05, 2020
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/Kirili4ik/QuartzNet-ASR-pytorch"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
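
The same request can be sketched in Python using only the standard library. This is a sketch under assumptions: the helper names `quality_url` and `fetch_quality` are hypothetical, the category/owner/repo path pattern is inferred from the single curl URL above, and the response is assumed to be JSON (its fields are not documented here).

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL (path pattern inferred from the curl example)."""
    parts = (category, owner, repo)
    return BASE_URL + "/" + "/".join(quote(p, safe="") for p in parts)


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the record. Assumption: the body is JSON; its schema is unknown."""
    with urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return json.load(resp)


url = quality_url("voice-ai", "Kirili4ik", "QuartzNet-ASR-pytorch")
print(url)
```

Calling `fetch_quality("voice-ai", "Kirili4ik", "QuartzNet-ASR-pytorch")` performs the network request. With no key, the 100 requests/day limit stated above applies.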