matteo-convertino/vosk-build-model
How to create your own model for vosk
This guide is for developers who want to build a custom speech-to-text model for Vosk, an open-source speech recognition toolkit. It walks through preparing your own audio and text data, configuring the Kaldi training environment, and building a custom language model for your needs. It is useful for transcribing audio with domain-specific vocabulary or accents that general-purpose models do not cover well.
No commits in the last 6 months.
Use this if you need to build a specialized speech recognition model for Vosk that accurately transcribes audio containing unique vocabulary, proper nouns, or specific speaking styles relevant to your industry or project.
Not ideal if you're a non-developer seeking an out-of-the-box solution or if you don't have access to computational resources for model training.
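As an illustration of the data-preparation step mentioned above: Kaldi recipes generally read a data directory containing `wav.scp` (utterance ID to audio path), `text` (utterance ID to transcript), and `utt2spk` (utterance ID to speaker ID). The sketch below is not taken from this repository. It assumes a hypothetical tab-separated manifest with the columns utterance ID, speaker ID, audio path, and transcript, and the function name `make_kaldi_data_dir` is made up for this example.

```shell
# Hypothetical helper, not part of vosk-build-model.
# Assumed manifest columns: utt_id <TAB> speaker_id <TAB> wav_path <TAB> transcript
make_kaldi_data_dir() {
  manifest=$1
  out_dir=$2
  tab=$(printf '\t')
  mkdir -p "$out_dir"

  # Kaldi expects these files sorted by utterance ID in the C locale.
  LC_ALL=C sort -t "$tab" -k1,1 "$manifest" > "$out_dir/manifest.sorted"

  # wav.scp: utterance ID followed by the audio path
  awk -F '\t' '{print $1, $3}' "$out_dir/manifest.sorted" > "$out_dir/wav.scp"
  # text: utterance ID followed by the transcript
  awk -F '\t' '{print $1, $4}' "$out_dir/manifest.sorted" > "$out_dir/text"
  # utt2spk: utterance ID followed by the speaker ID
  awk -F '\t' '{print $1, $2}' "$out_dir/manifest.sorted" > "$out_dir/utt2spk"

  rm -f "$out_dir/manifest.sorted"
}
```

A call might look like `make_kaldi_data_dir manifest.tsv data/train`. Prefixing each utterance ID with its speaker ID keeps the files consistently sorted, and Kaldi's `utils/utt2spk_to_spk2utt.pl` and `utils/validate_data_dir.sh` can then generate `spk2utt` and check the result.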
Stars: 75
Forks: 21
Language: Shell
License: MIT
Category
Last pushed: Aug 14, 2021
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/matteo-convertino/vosk-build-model"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
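To query the same endpoint for other repositories, the request can be wrapped in a small helper. This is a minimal sketch: the path layout `quality/<category>/<owner>/<repo>` is an assumption inferred from the single example URL above, and the function names `quality_url` and `fetch_quality` are hypothetical.

```shell
# Assumption: the path layout quality/<category>/<owner>/<repo> is inferred
# from the one example URL on this page; it is not documented here.
quality_url() {
  printf 'https://pt-edge.onrender.com/api/v1/quality/%s/%s/%s\n' "$1" "$2" "$3"
}

fetch_quality() {
  # --fail returns a non-zero exit status on HTTP errors;
  # --silent --show-error hides the progress meter but still reports failures.
  curl --fail --silent --show-error "$(quality_url "$1" "$2" "$3")"
}
```

For this repository the call would be `fetch_quality voice-ai matteo-convertino vosk-build-model`. The response format is not described on this page, so the helper prints the body unmodified.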
Higher-rated alternatives
pnnbao97/VieNeu-TTS
Vietnamese TTS with instant voice cloning • On-device • Real-time CPU inference • 24kHz audio...
CorentinJ/Real-Time-Voice-Cloning
Clone a voice in 5 seconds to generate arbitrary speech in real-time
babysor/MockingBird
🚀Clone a voice in 5 seconds to generate arbitrary speech in real-time
r9y9/nnmnkwii
Library to build speech synthesis systems designed for easy and fast prototyping.
Softcatala/open-dubbing
Open dubbing is an AI dubbing system which uses machine learning models to automatically...