InternRobotics/PointLLM
[ECCV 2024 Best Paper Candidate & TPAMI 2025] PointLLM: Empowering Large Language Models to Understand Point Clouds
This project offers a specialized Large Language Model (LLM) designed to interpret 3D object data. Given raw 3D point clouds as input, the model infers the object's type, shape, and appearance and produces descriptive text. It is aimed at researchers and engineers working with 3D sensor data who need to automatically describe or classify objects.
983 stars. No commits in the last 6 months.
Use this if you need an AI model that can understand and generate descriptions directly from 3D point cloud data, without being confused by depth ambiguities or occlusions.
Not ideal if your primary data source is 2D images or videos, or if you don't work with raw 3D point cloud representations.
Stars: 983
Forks: 53
Language: Python
License: —
Category: —
Last pushed: Aug 14, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/InternRobotics/PointLLM"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
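The same endpoint can be called from code. A minimal Python sketch, assuming the URL follows the pattern in the curl example above (ecosystem, then owner/repo); the response schema and any API-key header name are not documented here, so this only builds the request URL:

```python
# Minimal sketch: construct the quality-API URL for a repository.
# The path structure is taken from the curl example above; nothing
# beyond that (response fields, auth header) is assumed.
BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(ecosystem: str, owner: str, repo: str) -> str:
    """Return the API URL for a given ecosystem and owner/repo pair."""
    return f"{BASE}/{ecosystem}/{owner}/{repo}"

url = quality_url("transformers", "InternRobotics", "PointLLM")
print(url)
# https://pt-edge.onrender.com/api/v1/quality/transformers/InternRobotics/PointLLM
```

From there, fetching the JSON with any HTTP client (e.g. `curl` as shown, or `urllib.request` in Python) returns the stats listed above.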
Higher-rated alternatives
TinyLLaVA/TinyLLaVA_Factory
A Framework of Small-scale Large Multimodal Models
zjunlp/EasyInstruct
[ACL 2024] An Easy-to-use Instruction Processing Framework for LLMs.
rese1f/MovieChat
[CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
NVlabs/Eagle
Eagle: Frontier Vision-Language Models with Data-Centric Strategies