kyegomez/MegaVIT
The open source implementation of the model from "Scaling Vision Transformers to 22 Billion Parameters"
This project offers an open-source implementation of a very large vision transformer that classifies images into 1000 categories: it takes a raw image as input and outputs a prediction of what the image contains. It is aimed at machine learning researchers and practitioners who want state-of-the-art image recognition for computer vision tasks.
Use this if you are a machine learning researcher or engineer building or experimenting with large-scale image classification and recognition systems.
Not ideal if you are looking for a simple, out-of-the-box solution for common image tasks without deep technical understanding or access to significant computational resources.
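Like other vision transformers, the model described above starts by splitting the input image into fixed-size patches, each flattened into a token before attention is applied. As a rough illustration only (this is not the repository's code, and the patch size of 14 is borrowed from the ViT-22B paper, not confirmed for this implementation), the patchify step can be sketched in NumPy:

```python
import numpy as np

def patchify(img: np.ndarray, patch: int) -> np.ndarray:
    """Split a (C, H, W) image into flattened non-overlapping patches.

    Returns an array of shape (num_patches, C * patch * patch), i.e. one
    token per patch, ready for a linear projection into the model dim.
    """
    c, h, w = img.shape
    assert h % patch == 0 and w % patch == 0, "image must divide evenly into patches"
    return (
        img.reshape(c, h // patch, patch, w // patch, patch)
           .transpose(1, 3, 0, 2, 4)          # group the two patch-grid axes first
           .reshape((h // patch) * (w // patch), c * patch * patch)
    )

# A 224x224 RGB image with patch size 14 yields a 16x16 grid of patches,
# each flattened to 3 * 14 * 14 = 588 values.
img = np.zeros((3, 224, 224), dtype=np.float32)
tokens = patchify(img, 14)
print(tokens.shape)  # → (256, 588)
```

The actual model then projects each token, adds positional information, and runs the tokens through transformer blocks before a 1000-way classification head.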
Stars
32
Forks
1
Language
Python
License
MIT
Category
Last pushed
Feb 06, 2026
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/kyegomez/MegaVIT"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
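The same endpoint can be called from Python. The sketch below only builds the request URL from the path segments visible in the curl example above; `quality_url` is a hypothetical helper, not part of any published client, and the actual response schema is not documented here.

```python
from urllib.parse import quote

# Base path taken from the curl example on this page.
BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the quality-API URL for a repo (path layout assumed from the example)."""
    return f"{BASE}/{quote(category)}/{quote(owner)}/{quote(repo)}"

url = quality_url("transformers", "kyegomez", "MegaVIT")
print(url)  # → https://pt-edge.onrender.com/api/v1/quality/transformers/kyegomez/MegaVIT
# To fetch it (network call, rate-limited to 100 requests/day without a key):
#   import urllib.request
#   data = urllib.request.urlopen(url).read()
```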
Higher-rated alternatives
pairlab/SlotFormer
Code release for ICLR 2023 paper: SlotFormer on object-centric dynamics models
ChristophReich1996/Swin-Transformer-V2
PyTorch reimplementation of the paper "Swin Transformer V2: Scaling Up Capacity and Resolution"...
prismformore/Multi-Task-Transformer
Code of ICLR2023 paper "TaskPrompter: Spatial-Channel Multi-Task Prompting for Dense Scene...
DirtyHarryLYL/Transformer-in-Vision
Recent Transformer-based CV and related works.
uakarsh/latr
Implementation of LaTr: Layout-aware transformer for scene-text VQA, a novel multimodal...