researchmm/MM-Diffusion
[CVPR'23] MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation
This project helps content creators, animators, and researchers generate realistic, high-quality videos with synchronized audio. Starting from random noise, or conditioned on existing video or audio, it produces a new clip whose soundtrack stays aligned with the visuals. It’s ideal for quickly prototyping scenes or soundscapes without extensive manual editing (a conceptual sketch of the sampling loop follows the notes below).
452 stars. No commits in the last 6 months.
Use this if you need to generate novel, realistic video content with integrated, natural-sounding audio for creative projects or research.
Not ideal if you require precise control over specific elements of the generated video or audio, as this tool focuses on autonomous joint generation.
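MM-Diffusion's actual entry points are its released training and sampling scripts; the snippet below is not its API. It is a generic DDPM-style joint sampling loop, written only to illustrate how a single denoising process can keep video and audio in step. Every name here is a hypothetical stand-in: MultiModalUNet, the tensor shapes, and the 50-step noise schedule are all assumptions, not the repo's code.

import torch

# Hypothetical stub standing in for MM-Diffusion's coupled U-Net.
# The real model cross-attends between modalities; this stub just
# returns random noise so the sketch runs end to end.
class MultiModalUNet(torch.nn.Module):
    def forward(self, video, audio, t):
        return torch.randn_like(video), torch.randn_like(audio)

T = 50  # illustrative schedule length (diffusion papers typically use ~1000)
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

def ddpm_step(x, eps, t):
    # Standard DDPM posterior mean, plus fresh noise except at t = 0.
    mean = (x - betas[t] / torch.sqrt(1.0 - alpha_bars[t]) * eps) / torch.sqrt(alphas[t])
    if t > 0:
        mean = mean + torch.sqrt(betas[t]) * torch.randn_like(x)
    return mean

model = MultiModalUNet()
video = torch.randn(1, 16, 3, 64, 64)  # (batch, frames, channels, H, W) -- shape assumed
audio = torch.randn(1, 1, 25600)       # (batch, channels, samples) -- shape assumed

for t in reversed(range(T)):
    # One model call per step predicts noise for both modalities at once,
    # which is what keeps the generated audio and video synchronized.
    eps_v, eps_a = model(video, audio, t)
    video = ddpm_step(video, eps_v, t)
    audio = ddpm_step(audio, eps_a, t)

The design point: because a single network denoises both signals jointly at every timestep, audio-video alignment is learned rather than stitched on afterwards.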
Stars
452
Forks
25
Language
Python
License
MIT
Category
Diffusion
Last pushed
Jun 05, 2024
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/researchmm/MM-Diffusion"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
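The same endpoint works from a script. A minimal Python sketch, assuming the response body is JSON; the field names are not documented on this page, so inspect the payload before depending on any key.

import requests

URL = "https://pt-edge.onrender.com/api/v1/quality/diffusion/researchmm/MM-Diffusion"

# No key is needed under the anonymous 100-requests/day tier.
resp = requests.get(URL, timeout=10)
resp.raise_for_status()

data = resp.json()  # assumed JSON; check data.keys() before relying on fields
print(data)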
Higher-rated alternatives
hao-ai-lab/FastVideo
A unified inference and post-training framework for accelerated video generation.
ModelTC/LightX2V
A lightweight image and video generation inference framework
thu-ml/TurboDiffusion
TurboDiffusion: 100–200× Acceleration for Video Diffusion Models
PKU-YuanGroup/Helios
Helios: Real-Time Long Video Generation Model
PKU-YuanGroup/MagicTime
[TPAMI 2025🔥] MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators