real-stanford/semantic-abstraction
[CoRL 2022] This repository contains code for generating relevancies, training, and evaluating Semantic Abstraction.
This project helps roboticists and AI researchers enable robots to understand and interact with 3D environments using ordinary 2D cameras. Given 2D images or video of a scene, along with text descriptions of the objects you're looking for, it outputs a 'relevancy map' showing where those objects are likely to be in 3D space. It's designed for developers of AI systems that need to identify and localize a wide variety of objects, including ones the system hasn't been explicitly trained on, in real-world settings.
115 stars. No commits in the last 6 months.
Use this if you need to equip a robot or an autonomous system with the ability to find and localize diverse, potentially unfamiliar objects within complex 3D scenes using standard visual inputs.
Not ideal if your application only needs to identify objects from a predefined, closed set of categories, or if your work doesn't involve 3D scene understanding for robotics or AI.
Stars
115
Forks
6
Language
Python
License
MIT
Category
ML frameworks
Last pushed
Mar 09, 2023
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/real-stanford/semantic-abstraction"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
open-mmlab/mmpretrain
OpenMMLab Pre-training Toolbox and Benchmark
facebookresearch/mmf
A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
HuaizhengZhang/Awsome-Deep-Learning-for-Video-Analysis
Papers, code and datasets about deep learning and multi-modal learning for video analysis
KaiyangZhou/pytorch-vsumm-reinforce
Unsupervised video summarization with deep reinforcement learning (AAAI'18)
adambielski/siamese-triplet
Siamese and triplet networks with online pair/triplet mining in PyTorch