OpenM3D/M3DBench
[ECCV 2024] M3DBench introduces a comprehensive 3D instruction-following dataset with support for interleaved multi-modal prompts.
M3DBench is a large-scale dataset for training AI models to follow complex instructions that combine text, images, and 3D objects. It supports research on models that interpret real-world 3D environments, taking in diverse inputs such as 3D scans and images and producing coherent responses. It is aimed at AI researchers and developers working on advanced 3D perception and reasoning.
No commits in the last 6 months.
Use this if you are developing or training large AI models that need to understand and interact with multi-modal 3D data, facilitating tasks like autonomous navigation or robotic interaction in complex environments.
Not ideal if you are looking for an out-of-the-box solution for a specific 3D task without any AI model development or training involved.
Stars: 61
Forks: 1
Language: Python
License: —
Category: —
Last pushed: Oct 01, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/OpenM3D/M3DBench"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
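The same endpoint can be called from code. A minimal Python sketch follows; the `quality_url` helper is hypothetical, and the URL pattern (including the `transformers` category segment) is inferred from the single curl example above, since the response schema is not documented here:

```python
# Hypothetical helper for building request URLs against the pt-edge quality API.
# The path pattern /api/v1/quality/<category>/<owner>/<repo> is an assumption
# based on the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the per-repo quality endpoint URL."""
    return f"{BASE}/{category}/{owner}/{repo}"

print(quality_url("transformers", "OpenM3D", "M3DBench"))
# → https://pt-edge.onrender.com/api/v1/quality/transformers/OpenM3D/M3DBench
```

From there, any HTTP client (e.g. `urllib.request` from the standard library) can fetch the URL, subject to the rate limits above.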
Higher-rated alternatives
KimMeen/Time-LLM
[ICLR 2024] Official implementation of "🦙 Time-LLM: Time Series Forecasting by Reprogramming...
om-ai-lab/VLM-R1
Solve Visual Understanding with Reinforced VLMs
bytedance/SALMONN
SALMONN family: A suite of advanced multi-modal LLMs
NVlabs/OmniVinci
OmniVinci is an omni-modal LLM for joint understanding of vision, audio, and language.
fixie-ai/ultravox
A fast multimodal LLM for real-time voice