michaelzhang-ai/Speech2Video
ACCV 2020 "Speech2Video Synthesis with 3D Skeleton Regularization and Expressive Body Poses"
This project generates a photo-realistic video of a person speaking, with natural body movements, from nothing more than an audio recording of their speech. It takes speech audio as input and produces a video in which the speaker's lip movements, head gestures, and body language are expressive and synchronized with the audio. It suits content creators, educators, and presenters who need realistic speaking avatars.
Use this if you need to create a video of a specific person speaking, with synchronized and expressive body language, using only their voice recording.
Not ideal if you need to animate a fictional character or require highly customized, artistic control over every aspect of the animation.
Stars: 100
Forks: 9
Language: not specified
License: not specified
Category: not listed
Last pushed: Feb 27, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/michaelzhang-ai/Speech2Video"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
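The same request can be sketched in Python using only the standard library. This is a minimal sketch, not a documented client: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the URL layout is copied from the curl example above, and the assumption that the endpoint returns JSON (and that other owner/repo pairs follow the same path pattern) is not confirmed by this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example; treating "generative-ai" as a
# fixed path segment is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/generative-ai"


def build_quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for a repository.

    Assumes the response body is JSON; the page does not state the format.
    Each call counts against the 100 requests/day keyless limit.
    """
    request = urllib.request.Request(
        build_quality_url(owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the URL; call fetch_quality(...) to make the request.
    print(build_quality_url("michaelzhang-ai", "Speech2Video"))
```

Calling `fetch_quality("michaelzhang-ai", "Speech2Video")` issues the network request; because the keyless tier is rate-limited, caching the result locally is a reasonable choice when the data is polled repeatedly.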
Higher-rated alternatives
Mrkomiljon/awesome-generative-ai
Multimodal generative AI resources: talking heads, STT, TTS, image & video generation, and more.
NVIDIA/Maya-ACE
Maya-ACE: A Reference Client Implementation for NVIDIA ACE Audio2Face Service
OpenVGLab/OmniLottie
[CVPR 2026🔥] 🧑🎨 OmniLottie, an open-sourced multi-modal instructed vector animation generator...
jdh-algo/JoyHallo
JoyHallo: Digital human model for Mandarin
Boese0601/Dyadic-Interaction-Modeling
[ECCV 2024] Dyadic Interaction Modeling for Social Behavior Generation