nerminnuraydogan/vision-transformer
Vision Transformer explanation and implementation with PyTorch
This project helps machine learning practitioners understand how Vision Transformers classify images. The model takes an image as input, splits it into fixed-size patches, embeds each patch, and passes the resulting sequence through a Transformer Encoder. The output is the predicted class of the image, which makes the project useful for those who build or study image recognition systems.
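The pipeline described above (patches, embeddings, Transformer Encoder, class prediction) can be sketched in PyTorch as follows. This is a minimal illustration, not the repository's code: the class name `TinyViT`, the method `patchify`, and all sizes are hypothetical, and the class token and learned position embeddings are assumptions taken from the standard Vision Transformer design rather than from this listing.

```python
import torch
import torch.nn as nn


class TinyViT(nn.Module):
    """Hypothetical minimal ViT-style classifier (illustrative sizes only)."""

    def __init__(self, image_size=32, patch_size=8, channels=3,
                 dim=64, depth=2, heads=4, num_classes=10):
        super().__init__()
        assert image_size % patch_size == 0, "image must divide evenly into patches"
        self.patch_size = patch_size
        num_patches = (image_size // patch_size) ** 2
        patch_dim = channels * patch_size * patch_size

        # Linear projection of each flattened patch to the model dimension.
        self.embed = nn.Linear(patch_dim, dim)
        # Assumed from the standard ViT design: a class token and
        # learned position embeddings (zero-initialized here for brevity).
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches + 1, dim))

        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=2 * dim, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Linear(dim, num_classes)

    def patchify(self, images):
        """(B, C, H, W) -> (B, num_patches, C * p * p)."""
        b, c, h, w = images.shape
        p = self.patch_size
        x = images.reshape(b, c, h // p, p, w // p, p)
        x = x.permute(0, 2, 4, 1, 3, 5)  # (B, H/p, W/p, C, p, p)
        return x.reshape(b, (h // p) * (w // p), c * p * p)

    def forward(self, images):
        x = self.embed(self.patchify(images))            # embed each patch
        cls = self.cls_token.expand(x.shape[0], -1, -1)  # one class token per image
        x = torch.cat([cls, x], dim=1) + self.pos_embed  # prepend token, add positions
        x = self.encoder(x)                              # Transformer Encoder
        return self.head(x[:, 0])                        # classify from the class token


model = TinyViT().eval()
images = torch.randn(2, 3, 32, 32)  # random stand-in for a batch of two RGB images
with torch.no_grad():
    patches = model.patchify(images)
    logits = model(images)
predicted = logits.argmax(dim=1)    # predicted class index per image
```

With these illustrative sizes, each 32x32 image yields 16 patches of 192 values, and the model returns one score per class for each image. An untrained model like this one predicts arbitrary classes; the sketch only shows how the data flows.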
No commits in the last 6 months.
Use this if you are a machine learning engineer or researcher who wants to learn the inner workings of a Vision Transformer and implement one for image classification.
Not ideal if you are looking for a plug-and-play image classification tool or a general-purpose computer vision library and do not need to understand the model architecture.
Stars: 67
Forks: 9
Language: Jupyter Notebook
License: not specified
Category
Last pushed: Nov 11, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/nerminnuraydogan/vision-transformer"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
Higher-rated alternatives
Jittor/jittor
Jittor is a high-performance deep learning framework based on JIT compiling and meta-operators.
berniwal/swin-transformer-pytorch
Implementation of the Swin Transformer in PyTorch.
zhanghang1989/ResNeSt
ResNeSt: Split-Attention Networks
NVlabs/FasterViT
[ICLR 2024] Official PyTorch implementation of FasterViT: Fast Vision Transformers with...
ViTAE-Transformer/ViTPose
The official repo for [NeurIPS'22] "ViTPose: Simple Vision Transformer Baselines for Human Pose...