3dlg-hcvc/multi3drefer

[ICCV 2023] Multi3DRefer: Grounding Text Description to Multiple 3D Objects

/ 100

Emerging

This project helps professionals in fields such as augmented reality, robotics, and interior design identify and locate multiple 3D objects in a scanned indoor environment from a natural language description. You input a 3D scan of a room and a sentence describing certain objects, and it outputs the bounding boxes and identities of those objects. It suits anyone who needs to connect natural language instructions with detailed 3D scene understanding.

Use this if you need to programmatically identify and select specific objects in a complex 3D scanned scene using natural language descriptions, especially when multiple objects fit the description.

Not ideal if your task involves simple object recognition without textual descriptions, or if you only work with 2D images.
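
To make the input and output described above concrete, here is a minimal Python sketch of the task shape only. It is not the project's code or API: `Box3D`, `ground`, and the keyword-matching logic are hypothetical stand-ins for a learned model, and the scene data is invented for illustration. The point it shows is that a single description may resolve to zero, one, or several bounding boxes.

```python
from dataclasses import dataclass


@dataclass
class Box3D:
    """Axis-aligned 3D box (hypothetical structure, not the project's format)."""
    label: str
    center: tuple[float, float, float]  # x, y, z in metres
    size: tuple[float, float, float]    # width, depth, height in metres


def ground(description: str, boxes: list[Box3D]) -> list[Box3D]:
    """Toy stand-in for a grounding model: match description words to labels.

    A real system would reason jointly over scene geometry and language;
    this only illustrates that the result is a list of any length.
    """
    words = {w.strip(".,").lower() for w in description.split()}
    # Crude plural handling so that "chairs" also matches the label "chair".
    words |= {w[:-1] for w in words if w.endswith("s")}
    return [b for b in boxes if b.label.lower() in words]


# Invented example scene: two chairs and one table.
scene = [
    Box3D("chair", (1.0, 0.5, 0.45), (0.5, 0.5, 0.9)),
    Box3D("chair", (2.0, 0.5, 0.45), (0.5, 0.5, 0.9)),
    Box3D("table", (1.5, 1.5, 0.40), (1.2, 0.8, 0.8)),
]

for query in ("both black chairs", "the table", "the sofa by the window"):
    hits = ground(query, scene)
    print(f"{query!r}: {len(hits)} match(es)")
```

Returning a list rather than a single box is the design choice that matters here: it lets the caller distinguish "no such object", "exactly one", and "several" without special cases.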

3D-scene-understanding augmented-reality robotics-navigation interior-design-modeling spatial-query

No Package, No Dependents

Maintenance 6 / 25

Adoption 9 / 25

Maturity 16 / 25

Community 7 / 25

Language

Python

License

MIT

Higher-rated alternatives

3DOM-FBK/deep-image-matching

Multiview matching with deep-learning and hand-crafted local features for COLMAP and other SfM...

suhangpro/mvcnn

Multi-view CNN (MVCNN) for shape recognition

zouchuhang/LayoutNet

Torch implementation of our CVPR 18 paper: "LayoutNet: Reconstructing the 3D Room Layout from a...

andyzeng/tsdf-fusion-python

Python code to fuse multiple RGB-D images into a TSDF voxel volume.

andyzeng/tsdf-fusion

Fuse multiple depth frames into a TSDF voxel volume.
