sangramsingnk/Audio-Feature-Extraction
In sound processing, the mel-frequency cepstrum (MFC) is a representation of the short-term power spectrum of a sound, based on a linear cosine transform of a log power spectrum on a nonlinear mel scale of frequency. Mel-frequency cepstral coefficients (MFCCs) are coefficients that collectively make up an MFC.
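The definition above can be sketched in a few steps: take the power spectrum of a short frame, pool it with triangular filters spaced evenly on the mel scale, take the log, then apply a cosine transform. The sketch below uses only the Python standard library and is not the repository's code. The function names, the 2595/700 mel formula, the Hamming window, the filter count, and the coefficient count are all assumptions chosen for illustration.

```python
import math

def hz_to_mel(f_hz):
    # One common mel-scale formula (assumed here; other variants exist).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def power_spectrum(frame):
    # Naive O(N^2) DFT, kept simple for readability; returns N//2 + 1 bins.
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(frame))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(frame))
        spec.append((re * re + im * im) / n)
    return spec

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters whose centers are equally spaced on the mel scale.
    low, high = hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0)
    mels = [low + (high - low) * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / sample_rate) for m in mels]
    bank = [[0.0] * (n_fft // 2 + 1) for _ in range(n_filters)]
    for j in range(1, n_filters + 1):
        left, center, right = bins[j - 1], bins[j], bins[j + 1]
        for k in range(left, center):
            bank[j - 1][k] = (k - left) / (center - left)
        for k in range(center, right):
            bank[j - 1][k] = (right - k) / (right - center)
    return bank

def dct_ii(values, n_coeffs):
    # Unnormalized DCT-II: the "linear cosine transform" in the definition.
    n = len(values)
    return [
        sum(v * math.cos(math.pi * c * (i + 0.5) / n) for i, v in enumerate(values))
        for c in range(n_coeffs)
    ]

def mfcc_frame(frame, sample_rate, n_filters=20, n_coeffs=13):
    n = len(frame)
    # Hamming window to reduce spectral leakage at the frame edges.
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spec = power_spectrum(windowed)
    bank = mel_filterbank(n_filters, n, sample_rate)
    # Floor the energies so the log is defined even for empty filters.
    log_energies = [math.log(max(sum(w * p for w, p in zip(row, spec)), 1e-10))
                    for row in bank]
    return dct_ii(log_energies, n_coeffs)
```

Calling `mfcc_frame(frame, 8000)` on a 256-sample frame returns 13 coefficients. A practical pipeline would replace the naive DFT with an FFT and slide the frame across the whole recording.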
This tool helps researchers and engineers analyze audio recordings by extracting key characteristics like Mel-frequency cepstral coefficients (MFCCs), volume, pitch, and zero-crossing rate. You input raw audio files, and it outputs these extracted features, which can then be used for tasks like speaker identification or speech recognition. It's designed for someone working with speech datasets and needing to preprocess audio for further analysis.
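The other features named above (volume, pitch, and zero-crossing rate) each have simple textbook definitions. The sketch below is one possible reading, not the repository's implementation: volume is taken as RMS amplitude, pitch is estimated from the autocorrelation peak, and the function names and the 50 to 500 Hz search range are assumptions.

```python
import math

def rms_volume(frame):
    # Root-mean-square amplitude as a simple volume measure.
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def zero_crossing_rate(frame):
    # Fraction of adjacent sample pairs whose signs differ.
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def autocorr_pitch(frame, sample_rate, fmin=50.0, fmax=500.0):
    # Pick the lag with the largest autocorrelation inside the allowed
    # pitch range; returns 0.0 when no positive peak is found.
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_val = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        val = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return sample_rate / best_lag if best_lag else 0.0

if __name__ == "__main__":
    # Synthetic 200 Hz tone, 0.1 s at 8 kHz, as a stand-in for real speech.
    sr = 8000
    tone = [math.sin(2 * math.pi * 200 * i / sr) for i in range(800)]
    print(rms_volume(tone), zero_crossing_rate(tone), autocorr_pitch(tone, sr))
```

On real speech these would normally be computed per short frame (for example 20 to 30 ms) rather than over a whole file, and voiced/unvoiced detection would be needed before trusting the pitch estimate.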
No commits in the last 6 months.
Use this if you need to extract specific acoustic features from speech recordings to prepare them for machine learning models or detailed sound analysis.
Not ideal if you're looking for a complete end-to-end speech recognition system or a tool for general audio editing and production.
Stars: 9
Forks: 7
Language: Jupyter Notebook
License: —
Category:
Last pushed: Apr 20, 2022
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/sangramsingnk/Audio-Feature-Extraction"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
Higher-rated alternatives
julius-speech/julius
Open-Source Large Vocabulary Continuous Speech Recognition Engine
rolczynski/Automatic-Speech-Recognition
🎧 Automatic Speech Recognition: DeepSpeech & Seq2Seq (TensorFlow)
tabahi/formantfeatures
Extract frequency, power, width and dissonance of formants from wav files
libdriver/ld3320
LD3320 full-featured driver library for general-purpose MCU and Linux.
awsaf49/audio_classification_models
Tensorflow Audio Classification Models