anhvung/Capstone-Audio-Transcription
Exploring different ASR and language models for audio transcription
This project helps researchers and product developers create accurate text transcripts from audio recordings. It takes audio files, even those with background noise or different accents, and converts them into written text. It's ideal for anyone who needs to convert spoken language into text for analysis, search, or documentation.
No commits in the last 6 months.
Use this if you need to transcribe audio content and want to fine-tune a model for better accuracy on specific types of speech, accents, or languages.
Not ideal if you just need basic, off-the-shelf transcription and don't plan to customize or improve the model's performance with your own data.
Stars: 8
Forks: —
Language: Jupyter Notebook
License: —
Category
Last pushed: Dec 18, 2022
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/anhvung/Capstone-Audio-Transcription"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
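The same request can be made from a script or notebook. The following is a minimal sketch using only the Python standard library. It assumes the endpoint returns a JSON body and that the three path segments after `/quality/` are category, owner, and repository, as the curl URL above suggests. The function names `build_quality_url` and `fetch_quality` are hypothetical helpers, not part of any published client.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL in the same shape as the curl example.

    The category/owner/repo layout is an assumption inferred from that URL.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send a GET request and decode the body as JSON (assumed response format)."""
    request = Request(
        build_quality_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "anhvung", "Capstone-Audio-Transcription")` issues the same request as the curl command. With the keyless limit of 100 requests/day, it is worth caching the result rather than calling the endpoint in a loop.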
Higher-rated alternatives
Picovoice/rhino: On-device Speech-to-Intent engine powered by deep learning
yandexdataschool/speech_course: YSDA course in Speech Processing.
MycroftAI/adapt: Adapt Intent Parser
Picovoice/speech-to-intent-benchmark: Benchmark for Speech-to-Intent engines
IBM/BigLittleNet: Official repository for Big-Little Net