real-stanford/semantic-abstraction
[CoRL 2022] This repository contains code for generating relevancies, training, and evaluating Semantic Abstraction.
This project helps roboticists and AI researchers enable robots to understand and interact with 3D environments using ordinary 2D cameras. Given 2D images or video of a scene, along with text descriptions of the objects you're looking for, it outputs a 'relevancy map' showing where those objects are likely to be in 3D space. It's designed for developers of AI systems that need to identify and localize a wide variety of objects, including ones the system hasn't been explicitly trained on, in real-world settings.
115 stars. No commits in the last 6 months.
Use this if you need to equip a robot or an autonomous system with the ability to find and localize diverse, potentially unfamiliar objects within complex 3D scenes using standard visual inputs.
Not ideal if your application only needs to identify objects from a predefined, closed set of categories, or if your work doesn't involve 3D scene understanding for robotics or AI.
Stars
115
Forks
6
Language
Python
License
MIT
Category
ML frameworks
Last pushed
Mar 09, 2023
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/real-stanford/semantic-abstraction"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
open-mmlab/mmpretrain
OpenMMLab Pre-training Toolbox and Benchmark
facebookresearch/mmf
A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
HuaizhengZhang/Awsome-Deep-Learning-for-Video-Analysis
Papers, code and datasets about deep learning and multi-modal learning for video analysis
KaiyangZhou/pytorch-vsumm-reinforce
Unsupervised video summarization with deep reinforcement learning (AAAI'18)
adambielski/siamese-triplet
Siamese and triplet networks with online pair/triplet mining in PyTorch