daveredrum/ScanRefer

[ECCV 2020] ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language

/ 100

Emerging

This project helps locate specific objects within 3D scans of indoor environments. You provide a 3D point cloud of a scanned room and a natural language description, like "the small red chair near the window." The system outputs the precise 3D bounding box around that described object. This tool is useful for researchers and engineers working with 3D scene understanding and object interaction.
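To make the input and output described above concrete, here is a minimal sketch of the data shapes involved: a point cloud as a list of (x, y, z) tuples, and an axis-aligned 3D bounding box as the result. The names `BoundingBox3D` and `box_from_points` are hypothetical illustrations, not part of ScanRefer's code, and the language-conditioned model that selects which points belong to the described object is not shown.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


@dataclass
class BoundingBox3D:
    """Hypothetical axis-aligned 3D box, stored as min and max corners."""
    min_corner: Point
    max_corner: Point

    @property
    def center(self) -> Point:
        # Midpoint of the two corners along each axis.
        return tuple((lo + hi) / 2 for lo, hi in zip(self.min_corner, self.max_corner))

    @property
    def size(self) -> Point:
        # Extent of the box along each axis.
        return tuple(hi - lo for lo, hi in zip(self.min_corner, self.max_corner))


def box_from_points(points: Sequence[Point]) -> BoundingBox3D:
    """Return the tightest axis-aligned box enclosing the given points.

    Assumption: `points` are the points already attributed to the described
    object (e.g. "the small red chair near the window") by some model.
    """
    if not points:
        raise ValueError("need at least one point")
    xs, ys, zs = zip(*points)
    return BoundingBox3D((min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs)))


# Toy example: three points standing in for the object's points.
object_points = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0), (0.5, 1.0, 1.0)]
box = box_from_points(object_points)
```

In this sketch the box is reported as corners, with center and size derived from them; a real pipeline may use a different box parameterization.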

295 stars. No commits in the last 6 months.

Use this if you need to programmatically identify and pinpoint objects in 3D scanned scenes using human-readable descriptions.

Not ideal if your input data consists of traditional 2D images or if you don't require highly precise 3D localization.

3D-scanning indoor-mapping robotics-perception scene-understanding augmented-reality

Stale 6m No Package No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 15 / 25

Stars

295

Forks

Language

Python

License

—

Higher-rated alternatives

peteanderson80/Matterport3DSimulator

AI Research Platform for Reinforcement Learning from Real Panoramic Images.

cambridgeltl/visual-spatial-reasoning

[TACL'23] VSR: A probing benchmark for spatial understanding of vision-language models.

clairecyq/whos-waldo

Who's Waldo? Linking People Across Text and Images. ICCV 2021.

TheShadow29/vognet-pytorch

[CVPR20] Video Object Grounding using Semantic Roles in Language Description...

jianghaojun/Awesome-3D-Vision-and-Language

A collection of 3D vision and language (e.g., 3D Visual Grounding, 3D Question Answering and 3D...
