astorfi/lip-reading-deeplearning

:unlock: Lip Reading - Cross Audio-Visual Recognition using 3D Architectures

Score: 50 / 100 (Established)

This project helps researchers and engineers build systems that can understand spoken language by analyzing both sound and lip movements. You provide video clips with synchronized audio and video, and the system learns to match them, determining if the lip movements correspond to the spoken words. This is useful for anyone working on robust speech recognition, speaker verification, or human-computer interaction.

1,901 stars. No commits in the last 6 months.

Use this if you need to determine the correspondence between audio speech and visual lip movements, especially in challenging environments where audio quality might be poor.

Not ideal if you're looking for a complete, production-ready lip-reading application with a pre-built input pipeline, as users must prepare their own video and audio inputs.

Topics: speech-recognition, audio-visual-analysis, biometrics, human-computer-interaction, video-analytics
Flags: Stale (6 months), No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 24 / 25


Stars: 1,901
Forks: 333
Language: Python
License: Apache-2.0
Last pushed: Nov 07, 2022
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/astorfi/lip-reading-deeplearning"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
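As a minimal sketch of working with this endpoint, the snippet below builds the quality URL from the same path segments used in the curl example and summarizes a response. The JSON field names (`score`, `label`, `breakdown`) are assumptions for illustration only; the API's actual response schema is not documented here, so the sample payload simply mirrors the numbers shown on this card.

```python
import json

# Base endpoint taken from the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(collection: str, owner: str, repo: str) -> str:
    """Build the quality-score URL for one repository in a collection."""
    return f"{BASE}/{collection}/{owner}/{repo}"

# Hypothetical response shape: these field names are NOT confirmed by the
# API docs; the values just restate the card's published sub-scores.
sample_response = json.loads("""{
    "score": 50,
    "label": "Established",
    "breakdown": {"maintenance": 0, "adoption": 10, "maturity": 16, "community": 24}
}""")

url = quality_url("ml-frameworks", "astorfi", "lip-reading-deeplearning")
total = sum(sample_response["breakdown"].values())
print(url)
print(f"{sample_response['label']}: {total} / 100")
```

In a real client you would fetch `url` over HTTP (e.g. with `urllib.request` or `requests`) and handle rate-limit errors, since unauthenticated access is capped at 100 requests per day.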