YingqingHe/Awesome-LLMs-meet-Multimodal-Generation

🔥🔥🔥 A curated list of papers on LLMs-based multimodal generation (image, video, 3D and audio).

31 / 100 (Emerging)

This resource curates academic papers on how Large Language Models (LLMs) are used to create or modify media such as images, videos, 3D models, and audio. Researchers, academics, and AI practitioners can use it to find relevant studies quickly by modality and generation technique. The collection primarily provides the papers themselves along with their associated code or project pages.

540 stars. No commits in the last 6 months.

Use this if you are a researcher or practitioner in AI looking for a structured overview and direct access to recent academic work on LLM-powered image, video, 3D, or audio generation and editing.

Not ideal if you are looking for ready-to-use software, practical tutorials, or development tools for building multimodal applications.

AI research · Multimodal generation · Deep learning · Generative AI · Academic literature
No License · Stale 6m · No Package · No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 8 / 25
Community 13 / 25
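
The four sub-scores listed here add up to the overall 31 / 100 figure. The short sketch below checks that arithmetic; it assumes the overall score is a plain sum of the four categories, which matches the numbers on this page but is not confirmed by it.

```python
# Sub-scores as listed on this page; each category is out of 25.
# Assumption: the overall score is the plain sum of the four categories.
subscores = {
    "Maintenance": 0,
    "Adoption": 10,
    "Maturity": 8,
    "Community": 13,
}

total = sum(subscores.values())
max_total = 25 * len(subscores)

print(f"{total} / {max_total}")  # prints "31 / 100"
```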


Stars: 540
Forks: 30
Language: HTML
License: None
Last pushed: Apr 04, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/YingqingHe/Awesome-LLMs-meet-Multimodal-Generation"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
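
The same request can be made from Python with only the standard library. This is a minimal sketch under stated assumptions: the `category/owner/repo` path layout is inferred from the single URL shown above, the response is assumed to be JSON, and the names `quality_url` and `fetch_quality` are hypothetical helpers introduced for illustration. None of this is documented on the page itself.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (hypothetical helper).

    Assumption: the path is <category>/<owner>/<repo>, inferred from
    the one example URL on this page.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record without an API key (hypothetical helper).

    Assumption: the endpoint returns a JSON body; its schema is not
    described on this page, so the parsed value is returned as-is.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the URL; call fetch_quality(...) to make a real request,
    # keeping the 100 requests/day keyless limit in mind.
    print(quality_url("llm-tools", "YingqingHe",
                      "Awesome-LLMs-meet-Multimodal-Generation"))
```

Because unauthenticated use is capped at 100 requests/day, it is worth caching the result locally rather than calling `fetch_quality` repeatedly.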