OpenM3D/M3DBench
[ECCV 2024] M3DBench introduces a comprehensive 3D instruction-following dataset with support for interleaved multi-modal prompts.
M3DBench is a large-scale dataset for training AI models to follow complex instructions that combine text, images, and 3D objects. It supports research on models that interpret real-world 3D environments, taking in diverse inputs such as 3D scans and images and producing coherent responses. It is aimed at AI researchers and developers working on advanced 3D perception and reasoning.
No commits in the last 6 months.
Use this if you are developing or training large AI models that need to understand and interact with multi-modal 3D data, facilitating tasks like autonomous navigation or robotic interaction in complex environments.
Not ideal if you are looking for an out-of-the-box solution for a specific 3D task without any AI model development or training involved.
Stars: 61
Forks: 1
Language: Python
License: —
Category: —
Last pushed: Oct 01, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/OpenM3D/M3DBench"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
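The same endpoint can be called from code. A minimal Python sketch follows; the `quality_url` helper is hypothetical, and the URL pattern (including the `transformers` category segment) is inferred from the single curl example above, since the response schema is not documented here:

```python
# Hypothetical helper for building request URLs against the pt-edge quality API.
# The path pattern /api/v1/quality/<category>/<owner>/<repo> is an assumption
# based on the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the per-repo quality endpoint URL."""
    return f"{BASE}/{category}/{owner}/{repo}"

print(quality_url("transformers", "OpenM3D", "M3DBench"))
# → https://pt-edge.onrender.com/api/v1/quality/transformers/OpenM3D/M3DBench
```

From there, any HTTP client (e.g. `urllib.request` from the standard library) can fetch the URL, subject to the rate limits above.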
Higher-rated alternatives
KimMeen/Time-LLM
[ICLR 2024] Official implementation of "🦙 Time-LLM: Time Series Forecasting by Reprogramming...
om-ai-lab/VLM-R1
Solve Visual Understanding with Reinforced VLMs
bytedance/SALMONN
SALMONN family: A suite of advanced multi-modal LLMs
NVlabs/OmniVinci
OmniVinci is an omni-modal LLM for joint understanding of vision, audio, and language.
fixie-ai/ultravox
A fast multimodal LLM for real-time voice