microsoft/CvT
This is an official implementation of CvT: Introducing Convolutions to Vision Transformers.
This project helps machine learning engineers and researchers train accurate image classification models at lower cost. It takes a collection of labeled images and outputs a trained model that classifies new images, using fewer parameters and less computation than comparable Vision Transformers. The resulting models suit large-scale image classification and can be fine-tuned for downstream vision tasks.
602 stars. No commits in the last 6 months.
Use this if you need to train image classification models that combine high accuracy with a lower parameter count and computational cost than earlier Vision Transformers.
Not ideal if your primary goal is real-time inference on highly constrained edge devices where every millisecond and byte of memory is critical, as specialized smaller models might be more suitable.
Stars: 602
Forks: 127
Language: Python
License: MIT
Category: (not listed)
Last pushed: May 16, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/microsoft/CvT"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
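The same request can be made from Python. The following is a minimal sketch using only the standard library, not an official client. It assumes the path after `/quality/` follows a category/owner/repo pattern (inferred from the single curl example above) and that the endpoint returns a JSON body; neither is confirmed by this page. The names `build_quality_url` and `fetch_quality` are hypothetical helpers introduced for illustration.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build an endpoint URL shaped like the curl example.

    Assumption: the three path segments are category, owner, and repo.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return "/".join([BASE_URL, *parts])


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the endpoint and decode the body.

    Assumption: the response is JSON; its fields are not documented here.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to make the request.
    print(build_quality_url("ml-frameworks", "microsoft", "CvT"))
```

Each call to `fetch_quality("ml-frameworks", "microsoft", "CvT")` counts against the daily request limit, so caching the result locally is worth considering for repeated lookups.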
Related frameworks
Jittor/jittor
Jittor is a high-performance deep learning framework based on JIT compiling and meta-operators.
zhanghang1989/ResNeSt
ResNeSt: Split-Attention Networks
berniwal/swin-transformer-pytorch
Implementation of the Swin Transformer in PyTorch.
NVlabs/FasterViT
[ICLR 2024] Official PyTorch implementation of FasterViT: Fast Vision Transformers with...
ViTAE-Transformer/ViTPose
The official repo for [NeurIPS'22] "ViTPose: Simple Vision Transformer Baselines for Human Pose...