astorfi/lip-reading-deeplearning

:unlock: Lip Reading - Cross Audio-Visual Recognition using 3D Architectures

Score: 50 / 100 (Established)

This project helps researchers and engineers build systems that can understand spoken language by analyzing both sound and lip movements. You provide video clips with synchronized audio and video, and the system learns to match them, determining if the lip movements correspond to the spoken words. This is useful for anyone working on robust speech recognition, speaker verification, or human-computer interaction.

1,901 stars. No commits in the last 6 months.

Use this if you need to determine the correspondence between audio speech and visual lip movements, especially in challenging environments where audio quality might be poor.

Not ideal if you're looking for a complete, production-ready lip-reading application with a pre-built input pipeline, as users must prepare their own video and audio inputs.

Topics: speech-recognition, audio-visual-analysis, biometrics, human-computer-interaction, video-analytics
Flags: Stale (6 months), No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 24 / 25


Stars: 1,901
Forks: 333
Language: Python
License: Apache-2.0
Last pushed: Nov 07, 2022
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/astorfi/lip-reading-deeplearning"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
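As a minimal sketch of working with this endpoint, the snippet below builds the quality URL from the same path segments used in the curl example and summarizes a response. The JSON field names (`score`, `label`, `breakdown`) are assumptions for illustration only; the API's actual response schema is not documented here, so the sample payload simply mirrors the numbers shown on this card.

```python
import json

# Base endpoint taken from the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(collection: str, owner: str, repo: str) -> str:
    """Build the quality-score URL for one repository in a collection."""
    return f"{BASE}/{collection}/{owner}/{repo}"

# Hypothetical response shape: these field names are NOT confirmed by the
# API docs; the values just restate the card's published sub-scores.
sample_response = json.loads("""{
    "score": 50,
    "label": "Established",
    "breakdown": {"maintenance": 0, "adoption": 10, "maturity": 16, "community": 24}
}""")

url = quality_url("ml-frameworks", "astorfi", "lip-reading-deeplearning")
total = sum(sample_response["breakdown"].values())
print(url)
print(f"{sample_response['label']}: {total} / 100")
```

In a real client you would fetch `url` over HTTP (e.g. with `urllib.request` or `requests`) and handle rate-limit errors, since unauthenticated access is capped at 100 requests per day.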