m3hrdadfi/soxan

Wav2Vec for speech recognition, classification, and audio classification

Score: 44 / 100 (Emerging)

This project helps researchers analyze spoken language and other audio by training models to identify specific characteristics. You can input audio files, such as speech recordings, and it will output classifications like the emotion expressed (e.g., anger, happiness) or the type of sound (e.g., eating sound). This is designed for scientists, linguists, or audio researchers working with large datasets of spoken words or environmental sounds.

273 stars. No commits in the last 6 months.

Use this if you need to train or use pre-trained models for tasks like recognizing emotions in speech or classifying specific audio events from sound recordings.

Not ideal if you are looking for an out-of-the-box application for real-time speech-to-text transcription or simple audio editing.

speech-emotion-recognition audio-classification linguistic-analysis sound-event-detection acoustics-research
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 18 / 25

Stars: 273
Forks: 38
Language: Jupyter Notebook
License: Apache-2.0
Last pushed: Apr 02, 2022
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/m3hrdadfi/soxan"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
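
For scripted use, the curl command above could be sketched in Python roughly as follows. This is a minimal sketch using only the standard library. The helper names `quality_url` and `fetch_quality` are hypothetical, the `category/owner/repo` path layout is inferred from the single example URL, and the assumption that the response body is JSON is not confirmed by this page.

```python
import json
import urllib.request

# Base URL copied from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (path layout inferred from the one example shown)."""
    return f"{API_BASE}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send a keyless GET request and parse the body.

    Assumption: the API returns JSON; the response schema is not documented here.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "m3hrdadfi", "soxan")` would request the same URL as the curl command. Keyless access is limited to 100 requests/day, so a script that polls many repositories may want to cache results locally.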