kevobt/speech-to-text
Speech recognition framework using Keras
This framework helps you build your own custom speech recognition system. You provide audio recordings paired with their exact text transcripts, and it trains a neural network model. The output is a trained model that can convert new audio files into text. This is designed for researchers or developers who need to create specialized speech-to-text capabilities for specific domains or languages.
No commits in the last 6 months.
Use this if you need to train a speech recognition model on your unique dataset of audio and text, perhaps for a specialized vocabulary or language not well-covered by existing off-the-shelf solutions.
Not ideal if you simply need to transcribe audio using a pre-trained, general-purpose speech-to-text service without needing to build or customize the underlying model.
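As a rough illustration of what "audio recordings paired with their exact text transcripts" can look like in practice, the sketch below matches each `.wav` file to a `.txt` file with the same name. The directory layout, the file extensions, and the function name `pair_audio_with_transcripts` are assumptions made for this example, not the input format this framework actually expects.

```python
from pathlib import Path


def pair_audio_with_transcripts(audio_dir, transcript_dir):
    """Return (audio_path, transcript_text) pairs matched by file stem.

    Hypothetical helper: assumes one .wav per utterance and a .txt file
    with the same stem holding its transcript.
    """
    audio_dir, transcript_dir = Path(audio_dir), Path(transcript_dir)
    pairs = []
    for wav in sorted(audio_dir.glob("*.wav")):
        txt = transcript_dir / (wav.stem + ".txt")
        if not txt.exists():
            continue  # skip recordings that have no transcript
        text = txt.read_text(encoding="utf-8").strip()
        pairs.append((wav, text))
    return pairs
```

Recordings without a matching transcript are skipped rather than paired with empty text, since supervised training needs both halves of each example.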
Stars: 14
Forks: n/a
Language: Python
License: GPL-3.0
Category:
Last pushed: May 18, 2018
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/kevobt/speech-to-text"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
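The same request can be made from Python with only the standard library. This is a minimal sketch: it assumes the three path segments after `/quality/` are a category, a repository owner, and a repository name (inferred from the URL above), and that the endpoint returns JSON. Neither assumption is confirmed here, and the function names are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL; the segment meanings are an assumption."""
    parts = [quote(p, safe="") for p in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch and decode the response, assuming the body is JSON."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "kevobt", "speech-to-text")` would request the URL shown in the curl command. With the keyless limit of 100 requests/day, it may be worth caching results locally rather than re-fetching.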
Higher-rated alternatives
julius-speech/julius: Open-Source Large Vocabulary Continuous Speech Recognition Engine
rolczynski/Automatic-Speech-Recognition: 🎧 Automatic Speech Recognition: DeepSpeech & Seq2Seq (TensorFlow)
tabahi/formantfeatures: Extract frequency, power, width and dissonance of formants from wav files
libdriver/ld3320: LD3320 full-featured driver library for general-purpose MCU and Linux.
awsaf49/audio_classification_models: Tensorflow Audio Classification Models