kkakkkka/ETRIS

[ICCV-2023] The official code of Bridging Vision and Language Encoders: Parameter-Efficient Tuning for Referring Image Segmentation

Emerging

ETRIS helps computer vision researchers and practitioners segment specific objects in images from natural language descriptions. You provide an image and a text prompt (e.g., "the red car on the left"), and it outputs a pixel-level mask highlighting the referred object. This is useful for anyone working on automated image analysis and semantic understanding.
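To make the input/output relationship concrete, here is a toy sketch of what a segmentation mask is and how it can be applied to an image. It does not call ETRIS and does not reflect its API: the image values, the hand-written mask, and the helper names `apply_mask` and `mask_area` are all assumptions made for illustration.

```python
# Toy illustration only: this is NOT the ETRIS API. The mask below is
# hand-written to stand in for a model's output.

def apply_mask(image, mask, background=0):
    """Keep pixels where mask is 1; replace the rest with `background`."""
    if len(image) != len(mask) or any(
        len(img_row) != len(mask_row) for img_row, mask_row in zip(image, mask)
    ):
        raise ValueError("image and mask must have the same height and width")
    return [
        [pixel if keep else background for pixel, keep in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


def mask_area(mask):
    """Count the pixels assigned to the referred object."""
    return sum(sum(row) for row in mask)


# A 3x4 grayscale image (hypothetical values).
image = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
]

# Hypothetical binary mask for a prompt such as "the red car on the left".
mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]

segmented = apply_mask(image, mask)
```

In this sketch, `segmented` keeps only the top-left 2x2 block of pixels and `mask_area(mask)` reports how many pixels belong to the object. A real referring-segmentation model would produce the mask from the image and the text prompt.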

138 stars. No commits in the last 6 months.

Use this if you need to segment objects in images from descriptive text using parameter-efficient tuning rather than extensive model retraining.

Not ideal if you only need object detection or image classification rather than pixel-level segmentation, or if you don't have programming experience.

image-segmentation computer-vision visual-language-understanding object-localization AI-research

Stale 6m, No Package, No Dependents

Maintenance 2 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 8 / 25

Stars: 138

Forks:

Language: Python

License: MIT

Higher-rated alternatives

BR-IDL/PaddleViT

PaddleViT: State-of-the-art Visual Transformer and MLP Models for PaddlePaddle 2.0+

pathak22/unsupervised-video

[CVPR 2017] Unsupervised deep learning using unlabelled videos on the web

IBM/CrossViT

Official implementation of CrossViT. https://arxiv.org/abs/2103.14899

NVlabs/GCVit

[ICML 2023] Official PyTorch implementation of Global Context Vision Transformers

ViTAE-Transformer/ViTDet

Unofficial implementation for [ECCV'22] "Exploring Plain Vision Transformer Backbones for Object...
