ViTAE-Transformer/ViTAE-VSA

The official repo for [ECCV'22] "VSA: Learning Varied-Size Window Attention in Vision Transformers"

Quality score: 30 / 100 (Emerging)

This project helps computer vision researchers and AI practitioners improve image-analysis models. It processes raw image data with a varied-size window attention mechanism to improve accuracy on tasks such as image classification, object detection, and semantic segmentation. It is aimed at data scientists, machine learning engineers, and vision AI specialists working on complex image understanding problems.

158 stars. No commits in the last 6 months.

Use this if you are building or enhancing computer vision models and need improved accuracy for tasks like image classification, object detection, or semantic segmentation.

Not ideal if you are looking for an off-the-shelf application or do not have experience implementing and training deep learning models.

Tags: image-recognition, object-detection, semantic-segmentation, computer-vision, deep-learning-research

No License · Stale (6 months) · No Package · No Dependents

Maintenance: 2 / 25
Adoption: 10 / 25
Maturity: 8 / 25
Community: 10 / 25


Stars: 158
Forks: 9
Language: Python
License: none
Last pushed: Sep 25, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/ViTAE-Transformer/ViTAE-VSA"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
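The same endpoint can be called from Python with only the standard library. This is a minimal sketch: the URL pattern comes from the curl example above, but the shape of the JSON response is an assumption, so the helper simply returns the parsed payload rather than picking out specific fields.

```python
import json
from urllib.request import urlopen

BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the quality-report URL for a repo, e.g. the one shown above."""
    return f"{BASE}/{category}/{owner}/{repo}"

def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0) -> dict:
    """Fetch a repo's quality data as a dict.

    Anonymous access is limited to 100 requests/day; a free API key
    raises that to 1,000/day. The response schema is undocumented here,
    so callers should inspect the returned dict before relying on keys.
    """
    with urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return json.load(resp)
```

Example: `fetch_quality("ml-frameworks", "ViTAE-Transformer", "ViTAE-VSA")` issues the same request as the curl command above.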