SMIL-SPCRAS/DAVIS

Official repo for "Audio-Visual Speech Recognition In-the-Wild: Multi-Angle Vehicle Cabin Corpus and Attention-based Method" in ICASSP 2024

Experimental

This project provides a multi-angle audio-visual corpus recorded in vehicle cabins and an attention-based method for recognizing speech in noisy in-car conditions, including from different camera angles. It takes audio and video recordings of people speaking in cars and produces transcriptions of their voice commands. It is aimed at researchers and developers building voice control systems for in-car applications, especially for languages other than English.

No commits in the last 6 months.

Use this if you are developing or testing speech recognition systems for cars and need realistic, 'in-the-wild' data with varied angles and background noise.

Not ideal if your focus is on general-purpose speech recognition outside of vehicle environments or if you require a simple, ready-to-use API.

in-car voice control, automotive HMI, speech recognition, human-machine interaction, audio-visual processing

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 5 / 25

Maturity 8 / 25

Community 0 / 25

Language

JavaScript

License

—

Higher-rated alternatives

Uberi/speech_recognition

Speech recognition module for Python, supporting several engines and APIs, online and offline.

cmusphinx/pocketsphinx

A small speech recognizer

tensorflow/lingvo

Lingvo

modelscope/FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models,...

PyThaiNLP/pythaiasr

Python Thai Automatic Speech Recognition
