Haochen-Wang409/ross3d

[ICCV'25] Ross3D: Reconstructive Visual Instruction Tuning with 3D-Awareness

Experimental

Ross3D helps AI models understand complex 3D environments by training them to reconstruct scenes from multiple camera angles. It takes multiple 2D views or video frames of a 3D space and produces a scene-level understanding, such as a Bird's-Eye-View of the scene. This is useful for AI researchers and developers building systems that need to interpret and interact with physical 3D spaces.

No commits in the last 6 months.

Use this if you are developing AI models that need to accurately understand and interpret information from 3D environments, especially when working with limited 3D data.

Not ideal if your primary focus is on 2D image analysis or if you require an off-the-shelf application rather than a foundational AI model.

3D-scene-understanding robotics-perception AI-model-training computer-vision spatial-reasoning

Stale 6m, No Package, No Dependents

Maintenance 2 / 25

Adoption 8 / 25

Maturity 16 / 25

Community 3 / 25

Language Python

License Apache-2.0

Higher-rated alternatives

col14m/cadrille

[ICLR2026] cadrille: Multi-modal CAD Reconstruction with Online Reinforcement Learning

filaPro/cad-recode

[ICCV2025] CAD-Recode: Reverse Engineering CAD Code from Point Clouds

pengsongyou/openscene

[CVPR'23] OpenScene: 3D Scene Understanding with Open Vocabularies

worldbench/3EED

[NeurIPS 2025 DB Track] 3EED: Ground Everything Everywhere in 3D

cambrian-mllm/cambrian-s

Cambrian-S: Towards Spatial Supersensing in Video
