FrancescoSaverioZuppichini/ViT
Implementing Vi(sion) T(ransformer)
This project is a guide to implementing a Vision Transformer (ViT), a model for image recognition. A ViT splits an input image into smaller patches and processes those patches to classify the image's content. It is aimed at data scientists and machine learning engineers who work on computer vision problems and want to build and deploy image recognition systems.
453 stars. No commits in the last 6 months.
Use this if you are a machine learning engineer who needs to implement a Vision Transformer for image classification or other computer vision tasks.
Not ideal if you are looking for a plug-and-play solution without diving into the underlying implementation details.
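The patch-splitting step mentioned in the description can be sketched in a few lines. This is a minimal illustration in plain Python, not the repository's code: the function name `split_into_patches`, the nested-list image format, and the toy 4x4 input are all assumptions made for the example. A real implementation would operate on tensors and then project each flattened patch into an embedding.

```python
def split_into_patches(image, patch_size):
    """Split a 2D grid (a list of rows) into flattened, non-overlapping patches.

    Hypothetical helper for illustration; patches are returned in
    row-major order (left to right, then top to bottom).
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be divisible by patch_size")

    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one patch_size x patch_size block into a single list.
            patch = [
                image[row][col]
                for row in range(top, top + patch_size)
                for col in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches


# A toy 4x4 single-channel "image" whose pixel values are 0..15.
image = [[row * 4 + col for col in range(4)] for row in range(4)]
patches = split_into_patches(image, patch_size=2)

print(len(patches))  # 4
print(patches[0])    # [0, 1, 4, 5]
```

Each flattened patch plays the role of one token in the sequence that the transformer then processes.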
Stars: 453
Forks: 63
Language: not listed
License: not listed
Category:
Last pushed: Mar 19, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/FrancescoSaverioZuppichini/ViT"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000 requests/day.
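The same request can be made from Python with only the standard library. This is a sketch under stated assumptions: reading the URL path as a category segment followed by `owner/repo` is inferred from the single example above, the helper names `build_quality_url` and `fetch_quality` are hypothetical, and the response format is not documented here, so the body is returned as raw text rather than parsed.

```python
import urllib.request

# Base path taken from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, repo):
    """Build the endpoint URL (hypothetical helper).

    Assumes the path layout is <category>/<owner>/<repo>, as inferred
    from the one documented example.
    """
    return f"{API_BASE}/{category}/{repo}"


def fetch_quality(category, repo, timeout=10):
    """Fetch the endpoint and return the response body as text.

    The response format is not specified in this listing, so no
    parsing is attempted.
    """
    url = build_quality_url(category, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


url = build_quality_url("ml-frameworks", "FrancescoSaverioZuppichini/ViT")
print(url)
```

Calling `fetch_quality("ml-frameworks", "FrancescoSaverioZuppichini/ViT")` performs the network request and counts against the daily request limit.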
Higher-rated alternatives
Jittor/jittor
Jittor is a high-performance deep learning framework based on JIT compiling and meta-operators.
berniwal/swin-transformer-pytorch
Implementation of the Swin Transformer in PyTorch.
zhanghang1989/ResNeSt
ResNeSt: Split-Attention Networks
NVlabs/FasterViT
[ICLR 2024] Official PyTorch implementation of FasterViT: Fast Vision Transformers with...
ViTAE-Transformer/ViTPose
The official repo for [NeurIPS'22] "ViTPose: Simple Vision Transformer Baselines for Human Pose...