GuanRunwei/Awesome-Vision-Transformer-Collection

Variants of Vision Transformer and its downstream tasks

/ 100

Emerging

This is a curated collection of research papers and associated code for Vision Transformer models. It helps researchers and engineers quickly find and understand different approaches to processing image and video data with transformer architectures. Use it to explore state-of-the-art visual AI models and their applications.

257 stars. No commits in the last 6 months.

Use this if you are a researcher or AI engineer looking for a comprehensive overview of Vision Transformer models for tasks like image classification, video analysis, or point cloud processing.

Not ideal if you are looking for an off-the-shelf tool or library to directly apply Vision Transformers without deep technical understanding.

computer-vision image-processing video-analysis 3d-data-processing machine-learning-research

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 8 / 25

Community 16 / 25


Stars

257

Forks

Language

—

License

—

Higher-rated alternatives

Jittor/jittor

Jittor is a high-performance deep learning framework based on JIT compiling and meta-operators.

zhanghang1989/ResNeSt

ResNeSt: Split-Attention Networks

berniwal/swin-transformer-pytorch

Implementation of the Swin Transformer in PyTorch.

NVlabs/FasterViT

[ICLR 2024] Official PyTorch implementation of FasterViT: Fast Vision Transformers with...

ViTAE-Transformer/ViTPose

The official repo for [NeurIPS'22] "ViTPose: Simple Vision Transformer Baselines for Human Pose...
