EagleW/Multimedia-Generative-Script-Learning
Official implementation of the ACL Findings 2023 paper: Multimedia Generative Script Learning for Task Planning
This project helps generate the next logical steps for a given task, based on both text descriptions and images of previous steps. You input a task's title, previous method, a list of step texts, corresponding image captions, and the last step's image, and it outputs the predicted next step text and image. It's designed for developers building automated task planning or instructional systems, particularly for crafts and gardening.
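As a rough illustration of the inputs listed above, the sketch below bundles them into one record and flattens the step history into a single text prompt. The names `ScriptInput` and `flatten_history` and the prompt layout are hypothetical, chosen only for illustration; they are not taken from the repository's code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ScriptInput:
    """Hypothetical container for the inputs described above (not the repo's API)."""
    title: str                  # task title
    method: str                 # method the steps belong to
    step_texts: List[str]       # text of each previous step
    step_captions: List[str]    # caption of each previous step's image
    last_step_image: Optional[str] = None  # path to the last step's image

    def __post_init__(self) -> None:
        # Each step text is expected to have exactly one matching caption.
        if len(self.step_texts) != len(self.step_captions):
            raise ValueError("each step text needs a matching image caption")


def flatten_history(example: ScriptInput) -> str:
    """Join title, method, and (text, caption) pairs into one prompt string."""
    lines = [f"Task: {example.title}", f"Method: {example.method}"]
    pairs = zip(example.step_texts, example.step_captions)
    for i, (text, caption) in enumerate(pairs, start=1):
        lines.append(f"Step {i}: {text} [Image: {caption}]")
    return "\n".join(lines)
```

A generative model would then be asked to continue such a prompt with the next step; how the actual implementation encodes its inputs may differ.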
No commits in the last 6 months.
Use this if you are a machine learning researcher or developer working on AI models that need to predict sequential, multi-modal steps for tasks like crafts or gardening.
Not ideal if you are a casual user looking for a ready-to-use application to generate instructions, or if your tasks fall outside of detailed instructional sequences for crafts or gardening.
Stars: 8
Forks: —
Language: Python
License: MIT
Category:
Last pushed: Mar 18, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/EagleW/Multimedia-Generative-Script-Learning"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
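The same request can be sketched in Python using only the standard library. The helper names `build_url` and `fetch_quality` are hypothetical, and the assumption that the endpoint returns JSON is not confirmed by this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/rag"


def build_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for a given owner/repo pair."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record; assumes (unverified) that the body is JSON."""
    with urllib.request.urlopen(build_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("EagleW", "Multimedia-Generative-Script-Learning")` would issue the same GET request as the curl command and count against the daily request limit.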
Higher-rated alternatives
WangRongsheng/awesome-LLM-resources
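🧑🚀 The world's best summary of LLM resources (multimodal generation, agents, coding assistance, AI paper review, data processing, model training, model inference, o1 models, MCP, small language models, vision-language models) | Summary of the...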
SylphAI-Inc/AdalFlow
AdalFlow: The library to build & auto-optimize LLM applications.
LazyAGI/LazyLLM
The easiest and laziest way to build multi-agent LLM applications.
luhengshiwo/LLMForEverybody
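LLM knowledge sharing that everyone can understand; a must-read before LLM interviews in the spring/autumn campus recruiting seasons, so you can talk confidently with your interviewer.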
katanaml/sparrow
Structured data extraction and instruction calling with ML, LLM and Vision LLM