Vision Transformer Optimization ML Frameworks
Official implementations and research papers focused on improving Vision Transformer architectures through efficiency enhancements, dynamic token pruning, hierarchical designs, and architectural innovations. Does NOT include general computer vision frameworks, multimodal models, or non-transformer-based vision approaches.
103 vision transformer optimization frameworks are tracked. Eight score above 50 (Established tier). The highest-rated is Jittor/jittor at 59/100 with 3,221 stars. One of the top 10 is actively maintained.
Get all 103 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=ml-frameworks&subcategory=vision-transformer-optimization&limit=20"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
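The same request can be made from a script. The following is a minimal Python sketch that rebuilds the URL from the command above and fetches it with the standard library. The function names `build_url` and `fetch_projects` are hypothetical, the shape of the JSON response is not documented here and is returned unmodified, and whether the endpoint accepts `limit` values other than the 20 shown in the command is an assumption.

```python
import json
import urllib.parse
import urllib.request

# Endpoint and query parameters copied from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="ml-frameworks",
              subcategory="vision-transformer-optimization",
              limit=20):
    """Return the request URL with the query string URL-encoded."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(limit=20, timeout=30):
    """Fetch the listing and return the decoded JSON as-is.

    The response schema is an assumption: nothing here relies on
    particular keys, so the caller should inspect the result.
    """
    with urllib.request.urlopen(build_url(limit=limit), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_projects()` issues one request, which counts against the keyless daily quota mentioned above.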
| # | Framework | Score | Tier |
|---|---|---|---|
| 1 | Jittor/jittor: Jittor is a high-performance deep learning framework based on JIT compiling... | 59 | Established |
| 2 | berniwal/swin-transformer-pytorch: Implementation of the Swin Transformer in PyTorch. | | Established |
| 3 | zhanghang1989/ResNeSt: ResNeSt: Split-Attention Networks | | Established |
| 4 | NVlabs/FasterViT: [ICLR 2024] Official PyTorch implementation of FasterViT: Fast Vision... | | Established |
| 5 | ViTAE-Transformer/ViTPose: The official repo for [NeurIPS'22] "ViTPose: Simple Vision Transformer... | | Established |
| 6 | sniklaus/pytorch-pwc: a reimplementation of PWC-Net in PyTorch that matches the official Caffe version | | Established |
| 7 | microsoft/CvT: This is an official implementation of CvT: Introducing Convolutions to... | | Established |
| 8 | gaohuang/MSDNet: Multi-Scale Dense Networks for Resource Efficient Image Classification (ICLR... | | Established |
| 9 | Khrylx/AgentFormer: [ICCV 2021] Official PyTorch Implementation of "AgentFormer: Agent-Aware... | | Emerging |
| 10 | tobna/WhatTransformerToFavor: Github repository for the paper Which Transformer to Favor: A Comparative... | | Emerging |
| 11 | innat/DOLG-TensorFlow: Implementation of Deep Orthogonal Fusion of Local and Global Features in TensorFlow 2 | | Emerging |
| 12 | google-research/big_transfer: Official repository for the "Big Transfer (BiT): General Visual... | | Emerging |
| 13 | richzhang/PerceptualSimilarity: LPIPS metric. pip install lpips | | Emerging |
| 14 | Renumics/mesh2vec: Turn CAE mesh data => aggregated element feature vectors for ML | | Emerging |
| 15 | iduta/pyconv: Pyramidal Convolution: Rethinking Convolutional Neural Networks for Visual... | | Emerging |
| 16 | vra/dinov2-retrieval: A cli program of image retrieval using dinov2 | | Emerging |
| 17 | alon-albalak/TLiDB: Transfer Learning in Dialogue Benchmarking Toolkit | | Emerging |
| 18 | bwconrad/vit-finetune: Fine-tuning Vision Transformers on various classification datasets | | Emerging |
| 19 | clovaai/rexnet: Official Pytorch implementation of ReXNet (Rank eXpansion Network) with... | | Emerging |
| 20 | walsvid/CoordConv: Pytorch implementation of "An intriguing failing of convolutional neural... | | Emerging |
| 21 | PracticumAI/transfer_learning: Transfer learning is a powerful method allowing you to repurpose an AI model... | | Emerging |
| 22 | VicenteVivan/geo-clip: This is an official PyTorch implementation of our NeurIPS 2023 paper... | | Emerging |
| 23 | raoyongming/DynamicViT: [NeurIPS 2021] [T-PAMI] DynamicViT: Efficient Vision Transformers with... | | Emerging |
| 24 | Yangzhangcst/Transformer-in-Computer-Vision: A paper list of some recent Transformer-based CV works. | | Emerging |
| 25 | bryanlimy/V1T: [TMLR 2023] V1T: Large-scale mouse V1 response prediction using a Vision Transformer | | Emerging |
| 26 | LeapLabTHU/DAT: Repository of Vision Transformer with Deformable Attention (CVPR2022) and... | | Emerging |
| 27 | ShirAmir/dino-vit-features: Official implementation for the paper "Deep ViT Features as Dense Visual... | | Emerging |
| 28 | kampta/DeepLayout: PyTorch implementation of "LayoutTransformer: Layout Generation and... | | Emerging |
| 29 | thuml/Xlearn: Transfer Learning Library | | Emerging |
| 30 | mit-han-lab/offsite-tuning: Offsite-Tuning: Transfer Learning without Full Model | | Emerging |
| 31 | jwr1995/dc1d: A 1D implementation of a deformable convolutional layer in PyTorch with a few tricks. | | Emerging |
| 32 | htdt/hyp_metric: Hyperbolic Vision Transformers: Combining Improvements in Metric Learning \|... | | Emerging |
| 33 | chenhaoxing/SSFormers: This repository is the code of the paper "Sparse Spatial Transformers for... | | Emerging |
| 34 | fkodom/yet-another-retnet: A simple but robust PyTorch implementation of RetNet from "Retentive... | | Emerging |
| 35 | dongkyunk/DOLG-pytorch: Unofficial PyTorch Implementation of "DOLG: Single-Stage Image Retrieval... | | Emerging |
| 36 | intel/transfer-learning: Libraries and tools to support Transfer Learning | | Emerging |
| 37 | ChristophReich1996/MaxViT: PyTorch reimplementation of the paper "MaxViT: Multi-Axis Vision... | | Emerging |
| 38 | baraline/convst: Implementation of the Random Dilated Shapelet Transform algorithm along with... | | Emerging |
| 39 | AaltoVision/DGC-Net: A PyTorch implementation of "DGC-Net: Dense Geometric Correspondence Network" | | Emerging |
| 40 | amazon-science/semi-vit: PyTorch implementation of Semi-supervised Vision Transformers | | Emerging |
| 41 | NVlabs/FAN: Official PyTorch implementation of Fully Attentional Networks | | Emerging |
| 42 | DavidLandup0/deepvision: PyTorch and TensorFlow/Keras image models with automatic weight conversions... | | Emerging |
| 43 | daniel-code/TubeViT: An unofficial implementation of TubeViT in "Rethinking Video ViTs: Sparse... | | Emerging |
| 44 | FrancescoSaverioZuppichini/ViT: Implementing Vi(sion)T(transformer) | | Emerging |
| 45 | SunghwanHong/Cost-Aggregation-transformers: Official implementation of CATs | | Emerging |
| 46 | apple/parameterized-transforms: torchvision-based transforms that provide access to parameterization | | Emerging |
| 47 | NU-CUCIS/CrossPropertyTL: Cross-property Deep Transfer Learning | | Emerging |
| 48 | YifanXu74/Evo-ViT: Official implement of Evo-ViT: Slow-Fast Token Evolution for Dynamic Vision... | | Emerging |
| 49 | iduta/coconv: [ICCV W] Contextual Convolutional Neural Networks... | | Emerging |
| 50 | ViTAE-Transformer/ViTAE-Transformer: The official repo for [NeurIPS'21] "ViTAE: Vision Transformer Advanced by... | | Emerging |
| 51 | GuanRunwei/Awesome-Vision-Transformer-Collection: Variants of Vision Transformer and its downstream tasks | | Emerging |
| 52 | JoanaR/multi-mode-CNN-pytorch: A PyTorch implementation of the Multi-Mode CNN to reconstruct Chlorophyll-a... | | Emerging |
| 53 | MosbehBarhoumiRAI/VITON-PRE-PROCESSING: This repository contains the initial implementation of pre-processing for... | | Emerging |
| 54 | pavlo-melnyk/mlgp-embedme: The official implementation of the "Embed Me If You Can: A Geometric... | | Emerging |
| 55 | xiusu/ViTAS: Code for ViTAS_Vision Transformer Architecture Search | | Emerging |
| 56 | shikishima-TasakiLab/Involution-PyTorch: Unofficial PyTorch reimplemention of the paper "Involution: Inverting the... | | Emerging |
| 57 | AnkurDeria/MFT: Pytorch implementation of Multimodal Fusion Transformer for Remote Sensing... | | Emerging |
| 58 | insitro/ContextViT: Contextual Vision Transformers for Robust Representation Learning | | Emerging |
| 59 | graldij/transformer-fusion: Official repository of the "Transformer Fusion with Optimal Transport"... | | Emerging |
| 60 | benbergner/cropr: A token pruning method that accelerates ViTs for various tasks while... | | Emerging |
| 61 | shashankvkt/DoRA_ICLR24: This repo contains the official implementation of ICLR 2024 paper "Is... | | Emerging |
| 62 | paulgavrikov/CNN-Filter-DB: A database of over 1.4 billion 3x3 convolution filters extracted from... | | Emerging |
| 63 | ViTAE-Transformer/ViTAE-VSA: The official repo for [ECCV'22] "VSA: Learning Varied-Size Window Attention... | | Emerging |
| 64 | jman4162/PyTorch-Vision-Transformers-ViT: Explore fine-tuning the Vision Transformer (ViT) model for object... | | Emerging |
| 65 | nerminnuraydogan/vision-transformer: Vision Transformer explanation and implementation with PyTorch | | Emerging |
| 66 | altndrr/vic: Code implementation of our NeurIPS 2023 paper: Vocabulary-free Image Classification | | Emerging |
| 67 | billpsomas/simpool: This repo contains the official implementation of ICCV 2023 paper "Keep It... | | Experimental |
| 68 | Rishit-dagli/Transformer-in-Transformer: An Implementation of Transformer in Transformer in TensorFlow for image... | | Experimental |
| 69 | mako443/Text2Pos-CVPR2022: Code, dataset and models for our CVPR 2022 publication "Text2Pos" | | Experimental |
| 70 | alantess/transformer: Implementation of a modified vision transformer on the crypto market space | | Experimental |
| 71 | EthanBnntt/tinygrad-vit: A minimalist implementation of the ViT (Vision Transformer) model, using tinygrad | | Experimental |
| 72 | ViTAE-Transformer/LeMeViT: The official repo for [IJCAI'24] "LeMeViT: Efficient Vision Transformer with... | | Experimental |
| 73 | RohanG9929/LoFTR-in-Tensorflow: Code for our re-implementation of "LoFTR: Detector-Free Local Feature... | | Experimental |
| 74 | materight/RepNet-pytorch: A PyTorch port with pre-trained weights of RepNet, from "Counting Out Time:... | | Experimental |
| 75 | PegHeads-Inc/PegHeads-Tutorial-4: TRANSFER LEARNING: TO CREATE A PRE-TRAINED MODEL | | Experimental |
| 76 | EmPasLab/ExMobileVIT: ExMobileViT: Lightweight Classifier Extension for Mobile Vision Transformer | | Experimental |
| 77 | janaalbader28/Waste-Classification-ViT: Exploring the use of Vision Transformers (ViT) for waste classification | | Experimental |
| 78 | suous/RecNeXt: RecConv: Efficient Recursive Convolutions for Multi-Frequency Representations | | Experimental |
| 79 | sanket-poojary-03/Fine-tuning-ViVit: Python script to fine tune Open source Video Vision Transformer (ViVit)... | | Experimental |
| 80 | lizhh268/FSSUWNet: [IJCNN 2025 Oral] Official implementation of paper: FSSUWNet: Mitigating the... | | Experimental |
| 81 | WalterSimoncini/fungivision: Library implementation of "No Train, all Gain: Self-Supervised Gradients... | | Experimental |
| 82 | BobMcDear/vit-pytorch: PyTorch implementation of the vision transformer | | Experimental |
| 83 | zhouchenlin2096/Awesome-Transformer-for-Vision-Recognition: A comprehensive paper list of Transformer & Attention for Vision Recognition... | | Experimental |
| 84 | chinefed/convolutional-set-transformer: Official implementation of the Convolutional Set Transformer (Chinello &... | | Experimental |
| 85 | zs1314/Fraesormer: 【ICME2025 Oral】Offical Pytorch Code for "Fraesormer: Learning Adaptive... | | Experimental |
| 86 | rentainhe/ViT.pytorch: The Pytorch reimplementation of Vision Transformer | | Experimental |
| 87 | Tejeshyewale/transfer_learning_in_Deeplearning: This project demonstrates image classification using transfer learning with... | | Experimental |
| 88 | EvgenyKashin/non-leaking-conv: Implementation of Spectral Leakage and Rethinking the Kernel Size in CNNs in Pytorch | | Experimental |
| 89 | Atharv279/Transfer-Learning: Files containing projects related to Transfer Learning | | Experimental |
| 90 | AliKHaliliT/MobileViViT: MobileViViT, a higher dimensional adaptation of MobileViT | | Experimental |
| 91 | jiaowoguanren0615/DINOV2-Pytorch: This is a warehouse for DinoV2-models, based pytorch framework. | | Experimental |
| 92 | dabane-ghassan/int-lab-book: Foveated Spatial Transformers | | Experimental |
| 93 | MohammadRoodbari/Image-Classification: image classification with fine tuning the BEiT vision transformer on CIFAR 10 dataset | | Experimental |
| 94 | VikramRangarajan/SIEDD: A fast coordinate-based neural video encoder | | Experimental |
| 95 | nick8592/ViT-Classification-CIFAR10: This repository contains an implementation of the Vision Transformer (ViT)... | | Experimental |
| 96 | lucasjvds/ViT-for-Dark-Matter-Morphology: Under the international Google Summer of Code program, the project... | | Experimental |
| 97 | OSU-MLB/ViT_PEFT_Vision: [CVPR'25 (Highlight)] Lessons and Insights from a Unifying Study of... | | Experimental |
| 98 | aimagelab/TransFusion: Official codebase of "Update Your Transformer to the Latest Release:... | | Experimental |
| 99 | iijumanaAhmed/Waste-Classification-ViT: Exploring the use of Vision Transformers (ViT) for waste classification | | Experimental |
| 100 | techsup93/CIFAR10-CNN-vs-ViT: 🔍 Comparing CNN vs Vision Transformer (ViT) on CIFAR-10 with GPU T4 \| Deep... | | Experimental |
| 101 | sntsemilio/Transfer-learning: A machine learning project focused on transfer learning techniques using... | | Experimental |
| 102 | mahshid1378/SwinTransformerPytorch: Implementation of the Swin Transformer in PyTorch. and use Article:... | | Experimental |
| 103 | justanhduc/involution: A Pytorch CUDA/C++ JIT implementation with Python wrapper of Involution | | Experimental |