FrancescoSaverioZuppichini/ViT

Implementing Vi(sion)T(ransformer)

37 / 100

Emerging

This project is a guide to implementing a Vision Transformer (ViT), a transformer-based model for image recognition. The model takes an input image, splits it into smaller patches, and processes the sequence of patches to classify or otherwise interpret the image's content. It is aimed at data scientists and machine learning engineers who build and deploy image recognition systems.
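
The patch-splitting step described above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the repository's code: the function name `split_into_patches`, the toy image, and the single-channel list-of-lists representation are all hypothetical. A real ViT implementation typically works on multi-channel tensors and follows this step with a learned linear projection of each flattened patch.

```python
from typing import List

# Assumption: a single-channel image stored as rows of pixel values.
Image = List[List[float]]


def split_into_patches(image: Image, patch_size: int) -> List[List[float]]:
    """Split an H x W image into flattened, non-overlapping square patches.

    Patches are returned in row-major order (left to right, top to bottom),
    each flattened into a list of patch_size * patch_size values.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    if height % patch_size or width % patch_size:
        raise ValueError("image height and width must be divisible by patch_size")

    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one patch by reading its rows one after another.
            patch = [
                image[row][col]
                for row in range(top, top + patch_size)
                for col in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches


# A 4x4 toy image with pixel values 0..15 (hypothetical data).
toy_image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
patches = split_into_patches(toy_image, patch_size=2)
print(len(patches))  # 4
print(patches[0])    # [0.0, 1.0, 4.0, 5.0]
```

Each flattened patch plays the role of one token in the sequence the transformer processes, so a 4x4 image with 2x2 patches yields a sequence of four tokens.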

453 stars. No commits in the last 6 months.

Use this if you are a machine learning engineer who needs to implement a Vision Transformer model for image classification or other computer vision tasks.

Not ideal if you are looking for a plug-and-play solution without diving into the underlying implementation details.

image-recognition computer-vision deep-learning machine-learning-engineering

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 8 / 25

Community 19 / 25


Stars: 453

Forks: (not shown)

Language: not listed

License: not listed

Higher-rated alternatives

Jittor/jittor

Jittor is a high-performance deep learning framework based on JIT compiling and meta-operators.

berniwal/swin-transformer-pytorch

Implementation of the Swin Transformer in PyTorch.

zhanghang1989/ResNeSt

ResNeSt: Split-Attention Networks

NVlabs/FasterViT

[ICLR 2024] Official PyTorch implementation of FasterViT: Fast Vision Transformers with...

ViTAE-Transformer/ViTPose

The official repo for [NeurIPS'22] "ViTPose: Simple Vision Transformer Baselines for Human Pose...
