lucidrains/vit-pytorch

Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch

75 / 100
Verified

This project provides an accessible PyTorch implementation of the Vision Transformer (ViT) for image classification. A model takes a batch of images as input and outputs a score for each class, from which the predicted label is read. It is aimed at machine learning engineers and researchers who want to apply transformer-based vision models to their own image classification tasks.

24,988 stars. Used by 2 other packages. Actively maintained with 4 commits in the last 30 days. Available on PyPI.

Use this if you are developing computer vision systems and need a flexible, state-of-the-art approach to image classification.

Not ideal if you are a beginner with no experience in Python or deep learning frameworks, as it requires coding knowledge to implement.
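
For readers who do have that coding background, a minimal usage sketch might look like the following. It assumes the package has been installed from PyPI with `pip install vit-pytorch` and uses the `ViT` constructor arguments as documented in the project's README. The hyperparameter values are illustrative placeholders, and `num_patches` and `classify_random_batch` are hypothetical helper names introduced here, not part of the library.

```python
# Illustrative hyperparameters only; these are not tuned recommendations.
VIT_KWARGS = dict(
    image_size=256,   # input images are image_size x image_size pixels
    patch_size=32,    # each image is cut into patch_size x patch_size patches
    num_classes=1000,
    dim=1024,
    depth=6,
    heads=16,
    mlp_dim=2048,
)


def num_patches(image_size: int, patch_size: int) -> int:
    """Number of non-overlapping square patches a square image is split into."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    return (image_size // patch_size) ** 2


def classify_random_batch(batch_size: int = 1):
    """Run an untrained ViT on random input and return the class logits."""
    # Imported lazily so the helper above works without torch installed.
    import torch
    from vit_pytorch import ViT

    model = ViT(**VIT_KWARGS)
    model.eval()
    size = VIT_KWARGS["image_size"]
    images = torch.randn(batch_size, 3, size, size)  # stand-in for real images
    with torch.no_grad():
        return model(images)  # shape: (batch_size, num_classes)
```

The model here is untrained, so the logits returned by `classify_random_batch` are meaningless until the model has been trained on labelled images; the sketch only shows the shape of the input and output.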

image-classification computer-vision deep-learning machine-learning-research
Maintenance 16 / 25
Adoption 12 / 25
Maturity 25 / 25
Community 22 / 25

Stars 24,988
Forks 3,479
Language Python
License MIT
Last pushed Mar 27, 2026
Commits (30d) 4
Dependencies 3
Reverse dependents 2

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/computer-vision/lucidrains/vit-pytorch"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
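
The same request can be made from Python using only the standard library. This is a sketch under two assumptions that the page does not confirm: that the endpoint path generalizes to `category/owner/repo` (inferred from the single URL above), and that the response body is JSON. The function names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL; the category/owner/repo pattern is an assumption."""
    parts = (category, owner, repo)
    return BASE_URL + "/" + "/".join(quote(part, safe="") for part in parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record, assuming the server replies with JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Performs a real network request and counts against the daily limit.
    print(fetch_quality("computer-vision", "lucidrains", "vit-pytorch"))
```

Since unauthenticated access is capped at 100 requests/day, caching the result locally is advisable when calling this in a loop.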