yh1008/speech-to-text
Mixed-lingual speech recognition system; hybrid (GMM + NNet) model; Kaldi + Keras
This project transcribes speech in which English and Chinese are mixed within the same sentence, as many bilingual speakers do. It takes audio recordings that contain both languages and produces a written transcript of the mixed-language speech.
No commits in the last 6 months.
Use this if you need to accurately convert spoken audio containing both Chinese and English in the same sentences into a written transcript.
Not ideal if your audio is purely monolingual (only English or only Chinese) or if you require translation between languages rather than transcription of mixed speech.
Stars: 71
Forks: 19
Language: Jupyter Notebook
License: none listed
Category:
Last pushed: Nov 20, 2017
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/yh1008/speech-to-text"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
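The same request can be made from a script. The following is a minimal Python sketch, not an official client: the function names `quality_url` and `fetch_quality` are hypothetical, the `category/owner/repo` path layout is inferred only from the single URL shown above, and the response format is not documented here, so the body is returned as raw text (assumed to be UTF-8) rather than parsed.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, percent-encoding each path segment.

    Assumption: the path is category/owner/repo, as in the one example URL.
    """
    segments = (category, owner, repo)
    return BASE_URL + "/" + "/".join(quote(s, safe="") for s in segments)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the endpoint and return the raw response body as text.

    Assumption: the body is UTF-8 text; its structure is left unparsed.
    """
    with urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch_quality("voice-ai", "yh1008", "speech-to-text")` issues the same GET request as the curl command. Keyless access is limited to 100 requests/day, so a script that polls this endpoint should cache results.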
Higher-rated alternatives
julius-speech/julius
Open-Source Large Vocabulary Continuous Speech Recognition Engine
rolczynski/Automatic-Speech-Recognition
🎧 Automatic Speech Recognition: DeepSpeech & Seq2Seq (TensorFlow)
tabahi/formantfeatures
Extract frequency, power, width and dissonance of formants from wav files
libdriver/ld3320
LD3320 full-featured driver library for general-purpose MCU and Linux.
awsaf49/audio_classification_models
TensorFlow Audio Classification Models