YingqingHe/Awesome-LLMs-meet-Multimodal-Generation

🔥🔥🔥 A curated list of papers on LLMs-based multimodal generation (image, video, 3D and audio).

/ 100

Emerging

This resource curates academic papers on how Large Language Models (LLMs) are used to create or modify media such as images, videos, 3D models, and audio. Researchers and AI practitioners can use it to find relevant studies quickly by modality and generation technique. The collection primarily provides papers together with their associated code or project pages.

540 stars. No commits in the last 6 months.

Use this if you are a researcher or practitioner in AI looking for a structured overview and direct access to recent academic work on LLM-powered image, video, 3D, or audio generation and editing.

Not ideal if you are looking for ready-to-use software, practical tutorials, or development tools for building multimodal applications.

AI research, Multimodal generation, Deep learning, Generative AI, Academic literature

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 8 / 25

Community 13 / 25

Stars: 540

Forks:

Language: HTML

License: —

Higher-rated alternatives

chrisliu298/awesome-llm-unlearning

A resource repository for machine unlearning in large language models

worldbench/awesome-vla-for-ad

🌐 Vision-Language-Action Models for Autonomous Driving: Past, Present, and Future

hijkzzz/Awesome-LLM-Strawberry

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

zjukg/KG-MM-Survey

Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey

worldbench/awesome-spatial-intelligence

🌐 Forging Spatial Intelligence: A Roadmap of Multi-Modal Data Pre-Training for Autonomous Systems
