heng-hw/SpaCap3D
[IJCAI 2022] Spatiality-guided Transformer for 3D Dense Captioning on Point Clouds (official PyTorch implementation)
This project helps computer-vision researchers and developers automatically generate natural-language descriptions for specific objects within 3D scans. Given 3D point-cloud data, optionally augmented with RGB or normal information, it outputs concise, accurate captions for each object identified in the scan. It is well suited to tasks such as robotic scene understanding or producing accessible descriptions of 3D environments.
No commits in the last 6 months.
Use this if you need to automatically generate detailed textual descriptions for objects found within complex 3D scans or point clouds.
Not ideal if your primary goal is general scene captioning without object-specific focus, or if you only have 2D image data.
Stars: 21
Forks: 5
Language: Python
License: —
Category: —
Last pushed: Aug 31, 2022
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/heng-hw/SpaCap3D"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
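For scripted access, the curl call above can be reproduced in Python. This is a minimal sketch: the response schema and the `X-API-Key` header name are assumptions, not documented by the API description above; only the endpoint URL and rate limits come from this page.

```python
import json
from urllib.request import Request, urlopen

# Endpoint base taken from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality/nlp"


def quality_url(owner: str, repo: str) -> str:
    """Build the quality-endpoint URL for a given GitHub repo."""
    return f"{API_BASE}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str, api_key=None) -> dict:
    """Fetch quality data for a repo; a free key raises the daily limit.

    NOTE: the 'X-API-Key' header name is a guess; check the API docs
    for the actual authentication mechanism.
    """
    req = Request(quality_url(owner, repo))
    if api_key:
        req.add_header("X-API-Key", api_key)
    with urlopen(req, timeout=10) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Keyless use is limited to 100 requests/day.
    print(quality_url("heng-hw", "SpaCap3D"))
```

The URL builder is separated from the network call so the request target can be inspected or logged without spending a rate-limited request.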
Higher-rated alternatives
ntrang086/image_captioning
generate captions for images using a CNN-RNN model that is trained on the Microsoft Common...
fregu856/CS224n_project
Neural Image Captioning in TensorFlow.
vacancy/SceneGraphParser
A python toolkit for parsing captions (in natural language) into scene graphs (as symbolic...
ltguo19/VSUA-Captioning
Code for "Aligning Linguistic Words and Visual Semantic Units for Image Captioning", ACM MM 2019
Abdelrhman-Yasser/video-content-description
Video content description model for generating descriptions for unconstrained videos