ViTAE-Transformer/QFormer
The official repo for [TPAMI'23] "Vision Transformer with Quadrangle Attention"
QFormer is a vision transformer that replaces the fixed rectangular windows of window-based attention with learnable quadrangles, letting the model adapt its attention regions to image content. It serves as a backbone for tasks such as image classification, object detection, semantic segmentation, and human pose estimation. Computer vision developers and researchers can use it to build more accurate and efficient models.
235 stars. No commits in the last 6 months.
Use this if you are a computer vision engineer or researcher developing advanced image classification, object detection, semantic segmentation, or human pose estimation models and need to improve their accuracy and performance.
Not ideal if you are an end-user without a strong background in computer vision, deep learning, and Python development, as this is a foundational research tool.
Stars: 235
Forks: 10
Language: Python
License: MIT
Category:
Last pushed: Sep 25, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/computer-vision/ViTAE-Transformer/QFormer"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
BR-IDL/PaddleViT
:robot: PaddleViT: State-of-the-art Visual Transformer and MLP Models for PaddlePaddle 2.0+
pathak22/unsupervised-video
[CVPR 2017] Unsupervised deep learning using unlabelled videos on the web
IBM/CrossViT
Official implementation of CrossViT. https://arxiv.org/abs/2103.14899
NVlabs/GCVit
[ICML 2023] Official PyTorch implementation of Global Context Vision Transformers
ViTAE-Transformer/ViTDet
Unofficial implementation for [ECCV'22] "Exploring Plain Vision Transformer Backbones for Object...