yochaiye/LipVoicer
Official code implementation for the ICLR paper "LipVoicer: Generating Speech from Silent Videos Guided by Lip Reading"
This project generates speech from silent video footage, letting you hear what is being said even when no original audio exists. It takes a silent video as input and outputs a synchronized audio track of the spoken words. This is useful for content creators, video editors, or archivists who need to restore or add audio to silent clips of people speaking.
No commits in the last 6 months.
Use this if you have silent video footage of someone speaking and need to generate realistic speech to accompany their lip movements.
Not ideal if you need to generate speech for videos that don't clearly show a person's lips, or if you're looking for general voice synthesis that isn't tied to a video.
Stars: 86
Forks: 13
Language: Python
License: MIT
Category: diffusion
Last pushed: Sep 19, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/yochaiye/LipVoicer"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
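The same endpoint can be called from Python. A minimal sketch using only the standard library; the helper names are illustrative, and only the endpoint URL itself comes from this page (the shape of the JSON response is not documented here, so it is left as an opaque dict):

```python
import json
import urllib.request

# Base URL taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the API URL for a repo in a given category."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str) -> dict:
    """Fetch and decode the JSON payload (no API key: 100 requests/day)."""
    with urllib.request.urlopen(quality_url(category, owner, repo)) as resp:
        return json.load(resp)


# The LipVoicer endpoint shown above:
url = quality_url("diffusion", "yochaiye", "LipVoicer")
```

Calling `fetch_quality("diffusion", "yochaiye", "LipVoicer")` would perform the same request as the curl command, returning the decoded JSON.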
Higher-rated alternatives
PrunaAI/pruna
Pruna is a model optimization framework built for developers, enabling you to deliver faster,...
bytedance/LatentSync
Taming Stable Diffusion for Lip Sync!
haoheliu/AudioLDM-training-finetuning
AudioLDM training, finetuning, evaluation and inference.
Text-to-Audio/Make-An-Audio
PyTorch Implementation of Make-An-Audio (ICML'23) with a Text-to-Audio Generative Model
teticio/audio-diffusion
Apply diffusion models using the new Hugging Face diffusers package to synthesize music instead...