mozilla-ai/speech-to-text-finetune
Blueprint by Mozilla.ai for finetuning a Speech-To-Text model in your own language
This tool helps you accurately transcribe spoken audio into text, especially for languages or accents that general speech-to-text tools might struggle with. You provide your own audio recordings and their correct transcriptions to create a specialized speech recognition model. It's designed for language experts, researchers, or content creators who need highly accurate transcriptions for specific audio.
Use this if you need to create a high-accuracy speech-to-text model tailored to a unique language, dialect, or specialized vocabulary, and you have access to example audio and text pairs.
Not ideal if you just need to transcribe common languages with standard accuracy and don't want to invest time in creating a custom dataset or training a model.
Stars: 63
Forks: 9
Language: Python
License: Apache-2.0
Category:
Last pushed: Oct 23, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/mozilla-ai/speech-to-text-finetune"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
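The same request can be made from Python. The following is a minimal sketch using only the standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical, the path layout (category, owner, repo) is inferred from the single URL shown above, and the response format is not documented here, so the code falls back to raw text if the body is not JSON.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; the segment layout is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL from its three path segments (hypothetical helper)."""
    parts = (category, owner, repo)
    return BASE_URL + "/" + "/".join(quote(p, safe="") for p in parts)


def fetch_quality(url, timeout=10.0):
    """GET the URL and return parsed JSON, or the raw text if it is not JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = resp.read().decode("utf-8", errors="replace")
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        # The response schema is unknown, so do not assume JSON.
        return body
```

Calling `fetch_quality(build_quality_url("voice-ai", "mozilla-ai", "speech-to-text-finetune"))` issues one request, which counts against the daily limit described above.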
Higher-rated alternatives
TuananhCR/Dia-Finetuning-Vietnamese
TTS Dia finetuning for Vietnamese
dangvansam/viet-tts
VietTTS: An Open-Source Vietnamese Text to Speech
thinhlpg/vixtts-demo
A Vietnamese Voice Cloning Text-to-Speech Model ✨
NTT123/vietTTS
Vietnamese Text to Speech library
ekwek1/soprano-factory
Soprano-Factory: Train your own 2000x realtime text-to-speech model