3dlg-hcvc/multi3drefer
[ICCV 2023] Multi3DRefer: Grounding Text Description to Multiple 3D Objects
This project helps professionals in augmented reality, robotics, or interior design identify and locate multiple 3D objects in a scanned indoor environment from a natural language description. You input a 3D scan of a room and a sentence describing certain objects, and it outputs the bounding boxes and identities of those objects. It suits anyone who needs to connect human language instructions to detailed 3D scene understanding.
Use this if you need to programmatically identify and select specific objects in a complex 3D scanned scene using natural language descriptions, especially when multiple objects fit the description.
Not ideal if your task involves simple object recognition without textual descriptions, or if you only work with 2D images.
Stars: 94
Forks: 4
Language: Python
License: MIT
Category
Last pushed: Oct 18, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/computer-vision/3dlg-hcvc/multi3drefer"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
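The same request can be made from Python. The sketch below is a minimal illustration using only the standard library. It assumes the endpoint follows a `category/owner/repo` path pattern (inferred from the single URL above) and that the response body is JSON; neither is confirmed by this page. The function names `build_quality_url` and `fetch_quality` are hypothetical helpers, not part of any published client.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the request URL.

    Assumption: the path is <category>/<owner>/<repo>, generalized from
    the one URL shown on this page.
    """
    # Percent-encode each segment so unusual characters cannot break the path.
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    path = "/".join(parts)
    return f"{BASE_URL}/{path}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the data for one repository without an API key.

    Assumption: the response body is JSON. If it is not, json.loads
    raises a ValueError and the raw body should be inspected instead.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read().decode("utf-8")
    return json.loads(body)


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to make a real request.
    print(build_quality_url("computer-vision", "3dlg-hcvc", "multi3drefer"))
```

Because keyless access is limited to 100 requests per day, it is worth caching results locally rather than calling `fetch_quality` repeatedly for the same repository.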
Higher-rated alternatives
3DOM-FBK/deep-image-matching
Multiview matching with deep-learning and hand-crafted local features for COLMAP and other SfM...
suhangpro/mvcnn
Multi-view CNN (MVCNN) for shape recognition
zouchuhang/LayoutNet
Torch implementation of our CVPR 18 paper: "LayoutNet: Reconstructing the 3D Room Layout from a...
andyzeng/tsdf-fusion-python
Python code to fuse multiple RGB-D images into a TSDF voxel volume.
andyzeng/tsdf-fusion
Fuse multiple depth frames into a TSDF voxel volume.