SMIL-SPCRAS/DAVIS

Official repo for "Audio-Visual Speech Recognition In-the-Wild: Multi-Angle Vehicle Cabin Corpus and Attention-based Method" in ICASSP 2024

Overall score: 13 / 100 (Experimental)

This project provides a multi-angle vehicle cabin corpus and an attention-based method for audio-visual speech recognition in noisy in-car conditions. It takes audio and video recordings of people speaking in cars, captured from different camera angles, and transcribes their voice commands. It is aimed at researchers and developers building robust voice control systems for in-car applications, including for languages other than English.

No commits in the last 6 months.

Use this if you are developing or testing speech recognition systems for cars and need realistic, 'in-the-wild' data with varied angles and background noise.

Not ideal if your focus is on general-purpose speech recognition outside of vehicle environments or if you require a simple, ready-to-use API.

Tags: in-car voice control, automotive HMI, speech recognition, human-machine interaction, audio-visual processing
Badges: No License, Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 5 / 25
Maturity 8 / 25
Community 0 / 25


Stars: 9
Forks: (not listed)
Language: JavaScript
License: (none listed)
Last pushed: Apr 08, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/SMIL-SPCRAS/DAVIS"

Open to everyone: 100 requests/day with no key. A free key raises the limit to 1,000 requests/day.
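
The same request can be made from a script. The following is a minimal Python sketch using only the standard library. The helper names `quality_url` and `fetch_quality` are hypothetical, the URL pattern (category, owner, repo) is inferred from the single curl example above, and the response is assumed to be JSON, which this page does not confirm. How an API key is passed is not documented here, so the sketch covers only keyless access.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (pattern inferred from the curl example)."""
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record without a key.

    Assumption: the endpoint returns a JSON body. If it does not,
    json.loads raises a ValueError.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "SMIL-SPCRAS", "DAVIS")` issues the same GET request as the curl command. With the keyless limit of 100 requests/day, it is worth caching the result rather than calling it in a loop.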