sayakpaul/probing-vits
Probing the representations of Vision Transformers.
This project offers tools for understanding how Vision Transformer (ViT) models 'see' and process images and videos. Given an image or video, it generates visualizations such as attention maps and heatmaps that show which parts of the input the model attends to, helping researchers and AI practitioners analyze and debug the internal workings of various ViT architectures.
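The attention maps such tools produce are typically built with attention rollout (Abnar & Zuidema, 2020): attention matrices are recursively multiplied across layers to estimate how much each input patch contributes to the output tokens. A minimal sketch of that computation on random attention weights; this illustrates the general technique, not this repository's exact implementation:

```python
import numpy as np

def attention_rollout(attentions):
    """Attention rollout: multiply head-averaged attention maps across
    layers, mixing in the identity to account for residual connections.

    attentions: list of (num_tokens, num_tokens) row-stochastic matrices,
    one per transformer layer.
    """
    num_tokens = attentions[0].shape[0]
    rollout = np.eye(num_tokens)
    for attn in attentions:
        # Model the skip connection, then re-normalize rows to sum to 1.
        attn = 0.5 * attn + 0.5 * np.eye(num_tokens)
        attn = attn / attn.sum(axis=-1, keepdims=True)
        rollout = attn @ rollout
    return rollout

# Toy example: 1 CLS token + 4 patch tokens, 3 layers of random attention.
rng = np.random.default_rng(0)
layers = [rng.random((5, 5)) for _ in range(3)]
layers = [a / a.sum(axis=-1, keepdims=True) for a in layers]

# The CLS token's rolled-out attention over the 4 patches is what gets
# reshaped into a 2D heatmap over the input image.
cls_to_patches = attention_rollout(layers)[0, 1:]
```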
340 stars. No commits in the last 6 months.
Use this if you are a machine learning researcher or practitioner who wants to visualize and interpret the internal representations of different Vision Transformer models, particularly for understanding their attention mechanisms on images or videos.
Not ideal if you are looking for novel methods for probing neural networks, or if you need to train and visualize ViTs on very small datasets, as those capabilities are outside the project's focus or still under development.
Stars
340
Forks
22
Language
Jupyter Notebook
License
Apache-2.0
Category
Last pushed
Oct 05, 2022
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/sayakpaul/probing-vits"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
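The endpoint returns JSON that can be consumed from any language. A minimal Python sketch of parsing such a response; the field names below are assumptions inferred from the stats listed on this page, and the actual API schema may differ:

```python
import json

# Hypothetical payload mirroring the fields shown above; the real
# response from the /quality endpoint may use different keys.
sample = json.loads("""
{
  "repo": "sayakpaul/probing-vits",
  "stars": 340,
  "forks": 22,
  "language": "Jupyter Notebook",
  "license": "Apache-2.0",
  "last_pushed": "2022-10-05",
  "commits_30d": 0
}
""")

def summarize(data):
    """Render a quality record as a one-line summary."""
    return (f"{data['repo']}: {data['stars']} stars, {data['forks']} forks, "
            f"{data['language']}, {data['license']} license, "
            f"last pushed {data['last_pushed']}, "
            f"{data['commits_30d']} commits in the last 30 days")

print(summarize(sample))
```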
Higher-rated alternatives
jaehyunnn/ViTPose_pytorch
An unofficial implementation of ViTPose [Y. Xu et al., 2022]
UdbhavPrasad072300/Transformer-Implementations
Library - Vanilla, ViT, DeiT, BERT, GPT
tintn/vision-transformer-from-scratch
A Simplified PyTorch Implementation of Vision Transformer (ViT)
icon-lab/ResViT
Official Implementation of ResViT: Residual Vision Transformers for Multi-modal Medical Image Synthesis
gupta-abhay/pytorch-vit
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale