ViTAE-Transformer/ViTAE-VSA

The official repo for [ECCV'22] "VSA: Learning Varied-Size Window Attention in Vision Transformers"

Quality score: 30 / 100 (Emerging)

This project helps computer vision researchers and AI practitioners improve image-analysis models. It processes raw image data with a varied-size window attention mechanism to improve accuracy on tasks such as image classification, object detection, and semantic segmentation. It is aimed at data scientists, machine learning engineers, and vision AI specialists working on complex image understanding problems.

158 stars. No commits in the last 6 months.

Use this if you are building or enhancing computer vision models and need improved accuracy for tasks like image classification, object detection, or semantic segmentation.

Not ideal if you are looking for an off-the-shelf application or do not have experience implementing and training deep learning models.

Tags: image-recognition, object-detection, semantic-segmentation, computer-vision, deep-learning-research

No License · Stale (6 months) · No Package · No Dependents

Maintenance: 2 / 25
Adoption: 10 / 25
Maturity: 8 / 25
Community: 10 / 25


Stars: 158
Forks: 9
Language: Python
License: none
Last pushed: Sep 25, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/ViTAE-Transformer/ViTAE-VSA"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
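The same endpoint can be called from Python with only the standard library. This is a minimal sketch: the URL pattern comes from the curl example above, but the shape of the JSON response is an assumption, so the helper simply returns the parsed payload rather than picking out specific fields.

```python
import json
from urllib.request import urlopen

BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the quality-report URL for a repo, e.g. the one shown above."""
    return f"{BASE}/{category}/{owner}/{repo}"

def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0) -> dict:
    """Fetch a repo's quality data as a dict.

    Anonymous access is limited to 100 requests/day; a free API key
    raises that to 1,000/day. The response schema is undocumented here,
    so callers should inspect the returned dict before relying on keys.
    """
    with urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return json.load(resp)
```

Example: `fetch_quality("ml-frameworks", "ViTAE-Transformer", "ViTAE-VSA")` issues the same request as the curl command above.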