tugstugi/pytorch-speech-commands
Speech commands recognition with PyTorch | Kaggle 10th place solution in TensorFlow Speech Recognition Challenge
This project offers a pre-trained model for recognizing simple voice commands, like "yes," "no," or "stop." It takes short audio clips (typically 1 second) as input and outputs the specific command spoken, enabling applications controlled by voice. It's designed for developers building voice-controlled interfaces or analyzing speech commands.
201 stars. No commits in the last 6 months.
Use this if you are a developer looking for a robust, pre-trained model to implement basic speech command recognition in your applications or research.
Not ideal if you need to recognize continuous speech, complex sentences, or commands not present in the Google Speech Commands dataset.
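The description above says the model works on short clips of about one second. A minimal sketch of that input preparation, assuming 16 kHz mono audio as in the Google Speech Commands dataset: the helper `fit_to_one_second` is a hypothetical name, and the repository's own preprocessing may differ.

```python
SAMPLE_RATE = 16_000  # assumption: 16 kHz mono, as in Google Speech Commands


def fit_to_one_second(samples, sample_rate=SAMPLE_RATE, pad_value=0.0):
    """Return exactly one second of audio.

    Hypothetical helper, not part of the repository: clips longer than
    one second are cut to their first second, shorter clips are padded
    with silence at the end.
    """
    target = sample_rate                      # samples in one second
    clip = list(samples[:target])             # trim anything past one second
    clip.extend([pad_value] * (target - len(clip)))  # pad short clips
    return clip


# Usage: a 0.5 s clip becomes 16000 samples, the second half silent.
half_second = [0.1] * (SAMPLE_RATE // 2)
fixed = fit_to_one_second(half_second)
```

Keeping the first second is only one possible choice; centering the clip or picking the loudest window are equally plausible alternatives.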
Stars: 201
Forks: 45
Language: Python
License: —
Category: (none listed)
Last pushed: Jan 19, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/tugstugi/pytorch-speech-commands"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
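The same request can be made from Python using only the standard library. This is a sketch under assumptions: the URL pattern (category, owner, repository) is inferred from the single curl example above, the response is assumed to be JSON, and `build_quality_url` and `fetch_quality` are hypothetical helper names.

```python
import json
import urllib.request

# Base path taken from the curl example; everything after it is
# assumed to follow the pattern <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL for one repository (assumed pattern)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the listing data. Assumes the body is JSON; the response
    schema is not documented in this page, so no fields are parsed."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage (makes a network request and counts against the daily limit):
#   data = fetch_quality("voice-ai", "tugstugi", "pytorch-speech-commands")
#   print(data)
```

With no key, the stated limit is 100 requests per day, so caching the result locally is worth considering for anything that runs repeatedly.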
Higher-rated alternatives
julius-speech/julius
Open-Source Large Vocabulary Continuous Speech Recognition Engine
rolczynski/Automatic-Speech-Recognition
🎧 Automatic Speech Recognition: DeepSpeech & Seq2Seq (TensorFlow)
tabahi/formantfeatures
Extract frequency, power, width and dissonance of formants from wav files
libdriver/ld3320
LD3320 full-featured driver library for general-purpose MCU and Linux.
awsaf49/audio_classification_models
TensorFlow Audio Classification Models