Letian2003/MM_INF

An efficient multi-modal instruction-following data synthesis tool and the official implementation of Oasis (https://arxiv.org/abs/2503.08741).

Score: 23 / 100 (Experimental)

This tool helps AI researchers and data scientists efficiently create high-quality, diverse multimodal instruction-following datasets. You provide images, and it automatically generates various instructions and corresponding responses, which are crucial for training advanced multimodal large language models (MLLMs). It automates much of the data synthesis process, allowing you to focus on model development.

No commits in the last 6 months.

Use this if you need to rapidly generate large, diverse datasets of image-based instructions and responses to train or fine-tune multimodal AI models, particularly when starting with only raw images.

Not ideal if you primarily work with text-only data, already have high-quality annotated multimodal datasets, or are looking for a simple API for existing MLLMs rather than a data generation pipeline.

Tags: AI-training-data, multimodal-AI, LLM-fine-tuning, synthetic-data-generation, computer-vision
Status: No License, Stale 6m, No Package, No Dependents
Maintenance 2 / 25
Adoption 7 / 25
Maturity 8 / 25
Community 6 / 25

Stars: 39
Forks: 2
Language: Python
License: None
Last pushed: Jun 04, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/Letian2003/MM_INF"

Open to everyone: 100 requests/day with no key needed, or 1,000 requests/day with a free key.
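
The same request can be made from Python. The sketch below only mirrors the curl command above. It assumes the endpoint returns JSON, and it treats the `transformers` path segment as a category parameter; neither is confirmed by this page. The names `build_quality_url` and `fetch_quality` are hypothetical helpers, not part of any published client.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; everything after it is a path segment.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(owner, repo, category="transformers"):
    """Build the request URL, percent-encoding each path segment.

    Assumption: "transformers" is a category segment that can vary;
    the page only shows this one value.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(owner, repo, category="transformers", timeout=10.0):
    """Fetch the quality record for a repository.

    Assumption: the response body is JSON. No field names are assumed,
    so the parsed object is returned as-is.
    """
    url = build_quality_url(owner, repo, category)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Keyless access is limited to 100 requests/day, so avoid calling in a loop.
    print(json.dumps(fetch_quality("Letian2003", "MM_INF"), indent=2))
```

Separating URL construction from the network call keeps the first function checkable offline, and returning the parsed object unchanged avoids guessing at the response schema.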