Vision Transformer Optimization Computer Vision Tools

31 vision transformer optimization tools are tracked. One scores above 50 (the Established tier): BR-IDL/PaddleViT, the highest-rated, at 51/100 with 1,241 stars.

Get all 31 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=computer-vision&subcategory=vision-transformer-optimization&limit=20"

Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
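The same request can be made from Python. The sketch below is a minimal example using only the standard library; the helper names `build_url` and `fetch_projects` are illustrative, not part of the API. The response schema is not documented here, so the code only decodes the JSON and makes no assumption about its fields. The `limit=20` value is copied from the curl command above; whether a larger value returns all 31 projects is an assumption to verify.

```python
import json
import urllib.parse
import urllib.request

# Endpoint and query parameters copied from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(limit=20):
    """Build the request URL (hypothetical helper, not part of the API)."""
    params = {
        "domain": "computer-vision",
        "subcategory": "vision-transformer-optimization",
        "limit": limit,
    }
    return BASE_URL + "?" + urllib.parse.urlencode(params)


def fetch_projects(limit=20, timeout=30):
    """Fetch and decode the JSON response; its structure is not assumed."""
    with urllib.request.urlopen(build_url(limit), timeout=timeout) as resp:
        return json.load(resp)


# Usage (performs a network request, counted against the daily quota):
# data = fetch_projects()
# print(json.dumps(data, indent=2)[:500])
```

Without a key, each call counts toward the 100 requests/day allowance, so caching the decoded result locally is a reasonable default.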

#	Tool	Score	Tier	Stars	Language
1	BR-IDL/PaddleViT :robot: PaddleViT: State-of-the-art Visual Transformer and MLP Models for...	51	Established	1,241	Python
2	pathak22/unsupervised-video [CVPR 2017] Unsupervised deep learning using unlabelled videos on the web	48	Emerging	261	Lua
3	IBM/CrossViT Official implementation of CrossViT. https://arxiv.org/abs/2103.14899	45	Emerging	414	Python
4	NVlabs/GCVit [ICML 2023] Official PyTorch implementation of Global Context Vision Transformers	43	Emerging	447	Python
5	ViTAE-Transformer/ViTDet Unofficial implementation for [ECCV'22] "Exploring Plain Vision Transformer...	42	Emerging	579	Python
6	bytedance/SPTSv2 The official implementation of SPTS v2: Single-Point Text Spotting	41	Emerging	139	Python
7	wjun0830/QD-DETR Official pytorch repository for "QD-DETR : Query-Dependent Video...	41	Emerging	246	Python
8	Seokju-Cho/Volumetric-Aggregation-Transformer Official Implementation of VAT	39	Emerging	159	Python
9	PediaMedAI/ViTASD [ICASSP 2023] Official Implementation of ViTASD: Robust Vision Transformer...	39	Emerging	29	Python
10	amazon-science/glass-text-spotting Official implementation for "GLASS: Global to Local Attention for Scene-Text...	39	Emerging	102	Python
11	insitro/ChannelViT Channel Vision Transformers: An Image Is Worth C x 16 x 16 Words	38	Emerging	71	Python
12	ViTAE-Transformer/QFormer The official repo for [TPAMI'23] "Vision Transformer with Quadrangle Attention"	37	Emerging	235	Python
13	kkakkkka/ETRIS [ICCV-2023] The official code of Bridging Vision and Language Encoders:...	36	Emerging	138	Python
14	dlut-dimt/ReCoNet ECCV 2022 | Recurrent Correction Network for Fast and Efficient...	34	Emerging	50	Python
15	Haochen-Wang409/DropPos [NeurIPS'23] DropPos: Pre-Training Vision Transformers by Reconstructing...	32	Emerging	62	Python
16	dimiz51/FaceViT FaceViT: A multi-task Vision Transformer for face detection, age estimation,...	32	Emerging	4	Jupyter Notebook
17	altndrr/vicss Code implementation of our paper: Vocabulary-free Image Classification and...	32	Emerging	5	Python
18	demidovd98/sm-vit Official repository for the paper "Salient Mask-Guided Vision Transformer...	30	Emerging	21	Python
19	maclong01/DeBiFormer [ACCV 2024] Official code for "DeBiFormer: Vision Transformer with...	29	Experimental	32	Python
20	Lahdhirim/CV-human-pose-classifier-ViT-aws Human Pose Classifier using Vision Transformers (ViT) – end-to-end pipeline...	29	Experimental	5	Python
21	gianlucarloni/CoCoReco Code base for our paper "Connectivity-Inspired Network for Context-Aware...	29	Experimental	7	Python
22	ViLab-UCSD/MemSAC_ECCV2022 PyTorch code for MemSAC. To appear in ECCV 2022.	28	Experimental	8	Jupyter Notebook
23	d3tk/REOrder Does patch ordering affect context-limited vision transformers?	28	Experimental	17	Python
24	ViTAE-Transformer/ViTAE-Transformer-Scene-Text-Detection A comprehensive list [Hi-SAM@TPAMI'24, GoMatching@NeurIPS'24, DeepSolo(++)@...	25	Experimental	93	TeX
25	LeapLabTHU/DAT-Segmentation Repository of Vision Transformer with Deformable Attention (CVPR2022) and...	25	Experimental	26	Python
26	LeapLabTHU/DAT-Detection Repository of Vision Transformer with Deformable Attention (CVPR2022) and...	22	Experimental	20	Python
27	lorebianchi98/FG-OVD [CVPR 2024 Highlight] Official repository of the paper "The devil is in the...	22	Experimental	67	Python
28	ruohaoguo/pavsodr Official Implementation of "Instance-Level Panoramic Audio-Visual Saliency...	21	Experimental	11	Python
29	ruohaoguo/ovavss Official Implementation of "Open-Vocabulary Audio-Visual Semantic...	21	Experimental	35	Python
30	fangevo/ViT-LSTM-Foot-Contact-Detection Official implementation of a hybrid ViT-BiLSTM framework for fine-grained...	15	Experimental	13	Python
31	OmarAlsaqa/GeoViG Implementation for GeoViG: Geometry-Aware Graph Reasoning for Mobile Vision...	13	Experimental	—	Python
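
If the table above is copied as text rather than fetched as JSON, its rows can be parsed directly. The sketch below assumes the columns are tab-separated and that the Tool column starts with the `owner/repo` name followed by a space and the description; `parse_row` is a hypothetical helper, and the three sample rows are taken from the table. A missing star count (shown as a dash) is stored as `None`.

```python
from collections import Counter


def parse_row(line):
    """Split one tab-separated table row into a dict (illustrative helper)."""
    rank, tool, score, tier, stars, language = line.split("\t")
    # The Tool column fuses the repository name with its description.
    repo, _, description = tool.partition(" ")
    stars = stars.replace(",", "")
    return {
        "rank": int(rank),
        "repo": repo,
        "description": description,
        "score": int(score),
        "tier": tier,
        # Non-numeric star counts (the dash placeholder) become None.
        "stars": int(stars) if stars.isdigit() else None,
        "language": language,
    }


# Three rows copied from the table above.
SAMPLE_ROWS = [
    "1\tBR-IDL/PaddleViT :robot: PaddleViT: State-of-the-art Visual Transformer and MLP Models for...\t51\tEstablished\t1,241\tPython",
    "3\tIBM/CrossViT Official implementation of CrossViT. https://arxiv.org/abs/2103.14899\t45\tEmerging\t414\tPython",
    "31\tOmarAlsaqa/GeoViG Implementation for GeoViG: Geometry-Aware Graph Reasoning for Mobile Vision...\t13\tExperimental\t—\tPython",
]

rows = [parse_row(r) for r in SAMPLE_ROWS]
tier_counts = Counter(row["tier"] for row in rows)
```

Running the same parser over all 31 rows gives a per-tier tally that can be checked against the summary at the top of the page.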