ViTAE-Transformer/QFormer
The official repo for [TPAMI'23] "Vision Transformer with Quadrangle Attention"
QFormer is a vision transformer that replaces the fixed rectangular windows of window-based attention with learnable quadrangles, letting the model adapt its attention regions to image content. It serves as a backbone for tasks such as image classification, object detection, semantic segmentation, and human pose estimation. Computer vision developers and researchers can use it to build more accurate and efficient models.
235 stars. No commits in the last 6 months.
Use this if you are a computer vision engineer or researcher developing advanced image classification, object detection, semantic segmentation, or human pose estimation models and need to improve their accuracy and performance.
Not ideal if you are an end-user without a strong background in computer vision, deep learning, and Python development, as this is a foundational research tool.
Stars: 235
Forks: 10
Language: Python
License: MIT
Category:
Last pushed: Sep 25, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/computer-vision/ViTAE-Transformer/QFormer"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
BR-IDL/PaddleViT
:robot: PaddleViT: State-of-the-art Visual Transformer and MLP Models for PaddlePaddle 2.0+
pathak22/unsupervised-video
[CVPR 2017] Unsupervised deep learning using unlabelled videos on the web
IBM/CrossViT
Official implementation of CrossViT. https://arxiv.org/abs/2103.14899
NVlabs/GCVit
[ICML 2023] Official PyTorch implementation of Global Context Vision Transformers
ViTAE-Transformer/ViTDet
Unofficial implementation for [ECCV'22] "Exploring Plain Vision Transformer Backbones for Object...