SvenWientjes/SpeechRecognition
Classifies sound signals as Links, Midden or Rechts (Dutch for left, middle and right). Features are computed with a Mel-frequency filterbank by summing the frequency-domain power that falls within each filter. Dynamic Time Warping then aligns the unknown word with several labelled exemplars of each target word, and k-nearest neighbours picks the most likely class for the unknown word.
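The feature step described above can be sketched roughly as follows. This is an illustrative Python sketch, not the repository's Matlab code: the function names (`mel_filterbank`, `filterbank_energies`), the triangular filter shape, and the common `2595 * log10(1 + f / 700)` mel formula are all assumptions, since the page does not say which variant the project uses.

```python
import math


def hz_to_mel(f):
    # One common mel-scale formula (assumed; other variants exist).
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_bins, sample_rate):
    """Build triangular filters over n_bins power-spectrum bins (0 .. sample_rate / 2)."""
    nyquist = sample_rate / 2.0
    max_mel = hz_to_mel(nyquist)
    # n_filters + 2 edge frequencies, equally spaced on the mel scale.
    edges = [mel_to_hz(max_mel * i / (n_filters + 1)) for i in range(n_filters + 2)]
    # Centre frequency of each spectrum bin, in Hz.
    bin_hz = [k * nyquist / (n_bins - 1) for k in range(n_bins)]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                w = (f - lo) / (mid - lo)   # rising slope
            elif mid < f < hi:
                w = (hi - f) / (hi - mid)   # falling slope
            else:
                w = 0.0
            row.append(w)
        bank.append(row)
    return bank


def filterbank_energies(power_spectrum, bank):
    """Sum the spectral power inside each filter, weighted by the filter shape."""
    return [sum(w * p for w, p in zip(row, power_spectrum)) for row in bank]


# Toy usage: a flat power spectrum with 65 bins at an 8 kHz sample rate.
bank = mel_filterbank(n_filters=4, n_bins=65, sample_rate=8000)
energies = filterbank_energies([1.0] * 65, bank)
```

Applying `filterbank_energies` to the power spectrum of each short frame of a recording gives one feature vector per frame, so a word becomes a sequence of such vectors.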
This helps researchers or students in linguistics or signal processing classify short, distinct spoken words. You provide sound files of specific words, and it outputs a classification for an unknown word, along with visual plots of how it matched the sounds. This is ideal for those studying word recognition in small vocabularies.
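To make the matching step concrete, here is a minimal Python sketch of Dynamic Time Warping followed by a k-nearest-neighbour vote. It is an assumption-laden illustration rather than the project's Matlab implementation: the names `dtw_distance` and `knn_classify`, the Euclidean frame distance, the unconstrained warping path, and the one-dimensional toy "features" are all hypothetical stand-ins.

```python
import math
from collections import Counter


def frame_distance(a, b):
    # Euclidean distance between two feature vectors (assumed metric).
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b):
    """Total cost of the cheapest alignment between two sequences of feature vectors."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i frames of seq_a
    # with the first j frames of seq_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat a frame of seq_b
                                 cost[i][j - 1],      # repeat a frame of seq_a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def knn_classify(unknown, exemplars, k=3):
    """Vote among the k labelled exemplars closest to `unknown` under DTW."""
    ranked = sorted(exemplars, key=lambda ex: dtw_distance(unknown, ex[1]))
    votes = Counter(label for label, _ in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data: each word is a short sequence of 1-D feature vectors (made up).
exemplars = [
    ("Links", [[0], [1], [2]]),
    ("Links", [[0], [1], [1], [2]]),
    ("Midden", [[5], [5], [5]]),
    ("Midden", [[5], [6], [5]]),
    ("Rechts", [[9], [8], [7]]),
    ("Rechts", [[9], [9], [8], [7]]),
]
unknown = [[0], [0], [1], [2]]
predicted = knn_classify(unknown, exemplars, k=3)
```

Because DTW lets frames repeat, the unknown sequence matches both "Links" exemplars at zero cost even though the lengths differ, which is the reason for aligning before comparing.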
No commits in the last 6 months.
Use this if you need to classify a very limited set of spoken words, like 'Links', 'Midden', or 'Rechts', from audio samples.
Not ideal if you need to recognize a large vocabulary of words, process continuous speech, or require real-time recognition.
Stars: 7
Forks: 1
Language: Matlab
License: not specified
Last pushed: Jul 24, 2018
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/SvenWientjes/SpeechRecognition"
Open to everyone: 100 requests/day without a key, or 1,000/day with a free key.
Higher-rated alternatives
julius-speech/julius
Open-Source Large Vocabulary Continuous Speech Recognition Engine
rolczynski/Automatic-Speech-Recognition
🎧 Automatic Speech Recognition: DeepSpeech & Seq2Seq (TensorFlow)
tabahi/formantfeatures
Extract frequency, power, width and dissonance of formants from wav files
libdriver/ld3320
LD3320 full-featured driver library for general-purpose MCU and Linux.
awsaf49/audio_classification_models
Tensorflow Audio Classification Models