Kirili4ik/QuartzNet-ASR-pytorch
Automatic Speech Recognition (ASR) model QuartzNet trained on English CommonVoice. In PyTorch with CTC loss and beam search.
This project offers a pre-trained QuartzNet Automatic Speech Recognition (ASR) model for converting spoken English into text. You provide audio recordings in English, and it outputs the transcribed text. Because the model was trained on English CommonVoice, it is best suited to speech that resembles the recordings in that dataset.
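The description mentions CTC-based decoding. As a rough illustration of how per-frame CTC outputs become text, here is a minimal sketch of greedy (best-path) CTC decoding, which is a simpler stand-in for the beam search the project names. It is not taken from the repository: the `ALPHABET` vocabulary, the blank index, and the function names are all assumptions made for the example.

```python
from itertools import groupby

# Assumptions for illustration only: index 0 is the CTC blank symbol, and the
# vocabulary below is hypothetical ("_" is just a placeholder for the blank).
BLANK = 0
ALPHABET = "_ abcdefghijklmnopqrstuvwxyz'"


def ctc_greedy_decode(frame_scores):
    """Decode a sequence of per-frame score lists (one score per vocabulary entry)."""
    # 1. Pick the highest-scoring symbol in every frame (the "best path").
    best_path = [max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores]
    # 2. Collapse runs of the same symbol into a single symbol.
    collapsed = [index for index, _ in groupby(best_path)]
    # 3. Drop blanks; a blank between two identical letters keeps both letters.
    return "".join(ALPHABET[index] for index in collapsed if index != BLANK)


def one_hot(index, size=len(ALPHABET)):
    """Build a toy score vector that puts all the weight on one symbol."""
    return [1.0 if i == index else 0.0 for i in range(size)]


# Toy input: nine frames whose best symbols spell "hhe_ll_lo".
frames = [one_hot(ALPHABET.index(ch)) for ch in "hhe_ll_lo"]
print(ctc_greedy_decode(frames))  # prints "hello"
```

Beam search differs in that it keeps several candidate transcripts per frame instead of only the single best symbol, which is typically more accurate at a higher compute cost.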
No commits in the last 6 months.
Use this if you need to transcribe English audio recordings, particularly from datasets similar to Mozilla Common Voice, into text.
Not ideal if you need a model trained on a language other than English or require transcription of highly specialized or noisy audio without further fine-tuning.
Stars
16
Forks
3
Language
Jupyter Notebook
License
—
Category
Last pushed
Nov 05, 2020
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/Kirili4ik/QuartzNet-ASR-pytorch"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
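For scripted access, the same request can be sketched in Python using only the standard library. The base URL and path layout are copied from the curl command above; the helper names `quality_url` and `fetch_quality` are hypothetical, and reusing the same path layout for other repositories is an assumption. The response format is not described on this page, so the sketch returns the raw body text without parsing it.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path copied from the curl example on this page.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality/voice-ai"


def quality_url(repo):
    """Build the endpoint URL for an "owner/name" repository slug."""
    owner, name = repo.split("/", 1)
    # Percent-encode each path segment so unusual characters cannot alter the path.
    return f"{API_BASE}/{quote(owner, safe='')}/{quote(name, safe='')}"


def fetch_quality(repo, timeout=10):
    """Return the raw response body as text (performs a network request)."""
    with urlopen(quality_url(repo), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example call (needs network access and counts against the daily request limit):
# print(fetch_quality("Kirili4ik/QuartzNet-ASR-pytorch"))
```

Keeping URL construction separate from the network call makes the URL easy to check without spending any of the 100 keyless requests per day.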
Higher-rated alternatives
TensorSpeech/TensorFlowASR
⚡ TensorFlowASR: Almost State-of-the-art Automatic Speech Recognition in Tensorflow 2....
dangvansam/viet-asr
VietASR - Vietnamese Automatic Speech Recognition
wenet-e2e/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
xinjli/allosaurus
Allosaurus is a pretrained universal phone recognizer for more than 2000 languages
srvk/eesen
The official repository of the Eesen project