qizekun/ShapeLLM
[ECCV 2024] ShapeLLM: Universal 3D Object Understanding for Embodied Interaction
This project lets robots and AR systems understand 3D objects in the real world through natural language: you provide 3D scans (point clouds) of objects plus text questions, and it returns text answers describing or identifying those objects. It's aimed at roboticists, augmented reality developers, and anyone else building interactive systems that need to 'see' and 'talk about' their physical environment.
228 stars. No commits in the last 6 months.
Use this if you are developing an embodied AI system or an augmented reality application that needs to interpret 3D object data from sensors and respond to user queries in natural language.
Not ideal if your application primarily involves 2D image analysis, generating new 3D models, or requires highly precise 3D measurements rather than high-level object understanding.
Stars: 228
Forks: 17
Language: Python
License: Apache-2.0
Category:
Last pushed: Oct 08, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/qizekun/ShapeLLM"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
KimMeen/Time-LLM
[ICLR 2024] Official implementation of " 🦙 Time-LLM: Time Series Forecasting by Reprogramming...
om-ai-lab/VLM-R1
Solve Visual Understanding with Reinforced VLMs
bytedance/SALMONN
SALMONN family: A suite of advanced multi-modal LLMs
NVlabs/OmniVinci
OmniVinci is an omni-modal LLM for joint understanding of vision, audio, and language.
fixie-ai/ultravox
A fast multimodal LLM for real-time voice