3D Vision Transformers (Transformer Models)

Tools for 3D computer vision tasks using transformers, including depth estimation, multi-view geometry, structure-from-motion, point cloud processing, 3D pose estimation, and novel view synthesis. Does NOT include general 2D vision tasks, 2D pose estimation, or 3D shape generation without vision inputs.

85 3D vision transformer models are tracked. Four score above 50 (the Established tier). The highest-rated is NVlabs/MambaVision at 63/100 with 2,060 stars.

Get all 85 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=3d-vision-transformers&limit=20"

Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
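The same request can be made from a script. The following is a minimal Python sketch, assuming only what the curl example shows: the endpoint URL and the `domain`, `subcategory`, and `limit` query parameters. The helper names `build_url` and `fetch_quality` are hypothetical, the response schema is not described on this page, and whether `limit` accepts values above 20 (or how paging works) is not stated, so the sketch keeps the example's value.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="transformers", subcategory="3d-vision-transformers", limit=20):
    """Build the request URL with the same query parameters as the curl example."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_quality(url, timeout=30):
    """GET the URL and decode the body as JSON (hypothetical helper).

    The shape of the returned JSON is an unknown here; inspect it before
    relying on any particular field.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    url = build_url()
    print(url)
    # Uncomment to make a real request (it counts against the daily quota):
    # data = fetch_quality(url)
    # print(type(data))
```

Building the URL separately from fetching it makes it easy to check the query string before spending one of the 100 keyless requests per day.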

| # | Model | Description | Score | Tier |
|---|---|---|---|---|
| 1 | NVlabs/MambaVision | [CVPR 2025] Official PyTorch Implementation of MambaVision: A Hybrid... | 63 | Established |
| 2 | sign-language-translator/sign-language-translator | Python library & framework to build custom translators for the... | 58 | Established |
| 3 | kyegomez/Jamba | PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model" | 56 | Established |
| 4 | autonomousvision/transfuser | [PAMI'23] TransFuser: Imitation with Transformer-Based Sensor Fusion for... | 55 | Established |
| 5 | kyegomez/MultiModalMamba | A novel implementation of fusing ViT with Mamba into a fast, agile, and high... | 49 | Emerging |
| 6 | dali92002/DocEnTR | DocEnTr: An end-to-end document image enhancement transformer - ICPR 2022 | 47 | Emerging |
| 7 | fashn-AI/fashn-human-parser | Human parsing model for fashion and virtual try-on applications | 47 | Emerging |
| 8 | buaacyw/MeshAnything | [ICLR 2025] From anything to mesh like human artists. Official impl. of... | 44 | Emerging |
| 9 | buaacyw/MeshAnythingV2 | [ICCV 2025] From anything to mesh like human artists. Official impl. of... | 44 | Emerging |
| 10 | linjieli222/HERO | Research code for EMNLP 2020 paper "HERO: Hierarchical Encoder for... | 44 | Emerging |
| 11 | csiro-robotics/HOTFormerLoc | [IEEE/CVF CVPR 2025] Hierarchical Octree Transformer for Versatile Lidar... | 43 | Emerging |
| 12 | wgcban/HyperTransformer | [CVPR'22] HyperTransformer: A Textural and Spectral Feature Fusion... | 41 | Emerging |
| 13 | PediaMedAI/AggPose | [IJCAI 2022] Official PyTorch implementation of AggPose: Deep Aggregation... | 40 | Emerging |
| 14 | AllenXiangX/SnowflakeNet | (TPAMI 2023) Snowflake Point Deconvolution for Point Cloud Completion and... | 40 | Emerging |
| 15 | snktshrma/ngps_flight | Global vision positioning system for UAVs in outdoor GNSS-denied environments | 40 | Emerging |
| 16 | jhcho99/GSRTR | [BMVC'21] Official PyTorch Implementation of "Grounded Situation Recognition... | 40 | Emerging |
| 17 | ChenRocks/UNITER | Research code for ECCV 2020 paper "UNITER: UNiversal Image-TExt... | 39 | Emerging |
| 18 | AyushExel/trolo | An SDK for Transformers + YOLO and other SSD family models | 39 | Emerging |
| 19 | padeler/PE-former | 2D Human Pose estimation using transformers. Implementation in Pytorch | 39 | Emerging |
| 20 | xingyizhou/GTR | Global Tracking Transformers, CVPR 2022 | 38 | Emerging |
| 21 | hasanirtiza/PedesFormer-Transformer-Networks-For-Pedestrian-Detection | Transformer Networks for Pedestrian Detection | 38 | Emerging |
| 22 | icon-lab/SLATER | Official implementation of the paper: Unsupervised MRI Reconstruction via... | 38 | Emerging |
| 23 | jhcho99/CoFormer | [CVPR'22] Official PyTorch Implementation of "Collaborative Transformers for... | 37 | Emerging |
| 24 | VachanVY/Transfusion.torch | PyTorch Implementation of Transfusion: Predict the Next Token and Diffuse... | 37 | Emerging |
| 25 | kyegomez/AudioMamba | Implementation of the paper: "Audio Mamba: Bidirectional State Space Model... | 37 | Emerging |
| 26 | desaixie/zeroverse | Official code for NeurIPS 2024 paper LRM-Zero: Training Large Reconstruction... | 37 | Emerging |
| 27 | yihongXU/TransCenter | This is the official implementation of TransCenter (TPAMI). The code and... | 36 | Emerging |
| 28 | kyegomez/MambaDecoderBlock | MambaDecoderBlock is a novel decoder architecture that replaces traditional... | 35 | Emerging |
| 29 | DEV-D-GR8/SignSense | This repository contains a transformer-based model for real-time American... | 35 | Emerging |
| 30 | sam575/axial-gan | Code for "Simultaneous Face Hallucination and Translation for Thermal to... | 35 | Emerging |
| 31 | AndrewBoessen/PerfectRep | PerfectRep is a 3D pose estimation model tailored specifically for... | 33 | Emerging |
| 32 | kyegomez/VLM-Mamba | We introduce VLM-Mamba, the first Vision-Language Model built entirely on... | 32 | Emerging |
| 33 | ShengcaiLiao/TransMatcher | [NeurIPS 2021] TransMatcher: Deep Image Matching Through Transformers for... | 32 | Emerging |
| 34 | XunshanMan/MVGFormer | This is the official implementation of the work presented at CVPR 2024,... | 32 | Emerging |
| 35 | zubair-irshad/NeRF-MAE | [ECCV 2024] Pytorch code for our ECCV'24 paper NeRF-MAE: Masked AutoEncoders... | 32 | Emerging |
| 36 | xmartlabs/spoter-embeddings | Create embeddings from sign pose videos using Transformers | 32 | Emerging |
| 37 | Merterm/Modeling-Intensification-for-SLG | Public repo for the paper: "Modeling Intensification for Sign Language... | 31 | Emerging |
| 38 | NeurAI-Lab/MT-SfMLearner | Official code for 'Transformers in Unsupervised Structure-from-Motion' and... | 31 | Emerging |
| 39 | bhanuprathap2000/sign-language-recognition | This repo contains the code for sign-language-recognition as part of our... | 31 | Emerging |
| 40 | hukenovs/slovo | Slovo: Russian Sign Language Dataset and Models | 30 | Emerging |
| 41 | GregorKobsik/ImageTransformer | This notebook shows a basic implementation of a transformer (decoder)... | 30 | Emerging |
| 42 | kyegomez/Simba | A simpler Pytorch + Zeta Implementation of the paper: "SiMBA: Simplified... | 30 | Emerging |
| 43 | eslambakr/LAR-Look-Around-and-Refer | This is the official implementation for our paper: "LAR: Look Around and Refer". | 29 | Experimental |
| 44 | lamm-mit/FieldCompleter | GAN/convolutional and Transformer models to predict missing mechanical... | 29 | Experimental |
| 45 | loubnabnl/Sign-Segmentation-with-Transformers | Detection of temporal boundaries in sign language videos, as part of the... | 29 | Experimental |
| 46 | tthinking/MATR | [IEEE TIP 2022] Official implementation of MATR: Multimodal Medical Image... | 29 | Experimental |
| 47 | sauradip/STALE | [ECCV 2022] Official Pytorch Implementation of the paper: "Zero-Shot... | 29 | Experimental |
| 48 | xiuqhou/DAPE | [AAAI2026] Official implementation of the paper "DAPE: Harmonizing... | 27 | Experimental |
| 49 | sauradip/fewshotQAT | [BMVC 2021] Official PyTorch implementation of: "Few Shot Temporal Action... | 26 | Experimental |
| 50 | kyegomez/SimpleMamba | Implementation of a modular, high-performance, and simplistic mamba for... | 26 | Experimental |
| 51 | exitudio/GaitMixer | Official repository for "GaitMixer: Skeleton-based Gait Representation... | 25 | Experimental |
| 52 | icon-lab/TranSMS | Official Implementation of Transformers for System Matrix Super-resolution (TranSMS) | 24 | Experimental |
| 53 | musialski-lab/LayoutEnhancer | Source code for the Paper: Layout Enhancer | 23 | Experimental |
| 54 | AshutoshKulkarni4998/AIDTransformer | Inference code for "Aerial Image Dehazing with Attentive Deformable... | 22 | Experimental |
| 55 | albrateanu/KANT | [Sensors 2025] Enhancing Low-Light Images with Kolmogorov–Arnold Networks in... | 22 | Experimental |
| 56 | mabdn/feasible-interpretable-trajectory-prediction | A Transformer neural network for autonomous driving to predict the future... | 22 | Experimental |
| 57 | mustafa1728/Person-Re-ID | Experiments on some existing Re-ID methods on a different dataset with... | 22 | Experimental |
| 58 | artem-gorodetskii/TransPix2Pix | Rethinking the Pix2Pix architecture with attention mechanisms and transformers. | 22 | Experimental |
| 59 | LookUpMark/dylem-grid | DYLEM-GRID is a deep learning project for dynamic hand gesture recognition... | 22 | Experimental |
| 60 | RisabBiswas/T2T-BinFormer | SOTA Document Image Enhancement - T2T-BinFormer: Effective Document Image... | 21 | Experimental |
| 61 | arafathosense/Real-Time-Face-Glitch-Effect-Controlled-by-Hand-Gestures | A real-time interactive computer vision art project using OpenCV. Control a... | 21 | Experimental |
| 62 | Abdullah-Shah-26/Sign-Cast | Real-time AI-powered voice-to-sign language translator. Converts speech to... | 21 | Experimental |
| 63 | HowieMa/PPT | [ECCV 2022] "PPT: token-Pruned Pose Transformer for monocular and multi-view... | 21 | Experimental |
| 64 | Microsatellites-and-Space-Microsystems/pose_estimation_domain_gap | Two methods for solving domain gap in satellite pose estimation in space... | 21 | Experimental |
| 65 | gmongaras/2Mamba2Furious | Code for the paper "2Mamba2Furious: Linear in complexity, competitive in accuracy" | 20 | Experimental |
| 66 | freddxvill/Proyecto_Traductor_de_la_LSB | Translator from Bolivian Sign Language (LSB) to text using networks... | 20 | Experimental |
| 67 | zwh0527/AGRNet | Code for "Mining Global Relativity Consistency without Neighborhood Modeling... | 19 | Experimental |
| 68 | aliebayani/TransGAN-DX | A Hybrid Transformer-GAN Approach for Cardiovascular Disease Diagnosis | 19 | Experimental |
| 69 | anupvna/street-view-geolocation | Multi-view Deep Learning pipeline using PyTorch to predict global... | 19 | Experimental |
| 70 | GregorKobsik/Octree-Transformer | Octree Transformer: Autoregressive 3D Shape Generation on Hierarchically... | 19 | Experimental |
| 71 | junayed-hasan/spontaneous-smile-recognition | A deep learning framework for distinguishing spontaneous from posed smiles... | 19 | Experimental |
| 72 | tthinking/SETFusion | [PR 2026] Official implementation of SETFusion: A Semantic Transformer for... | 18 | Experimental |
| 73 | rukmini-17/scalable-sequence-modeling | Comparative analysis of Mamba vs. Transformers trained from scratch.... | 17 | Experimental |
| 74 | codedmachine111/Image_generation_using_transformers_in_GANs | Image Generation using Transformers in GANs | 17 | Experimental |
| 75 | ImKeTT/ReSee | [EMNLP'23 Oral] ReSee: Responding through Seeing Fine-grained Visual... | 13 | Experimental |
| 76 | botmahn/slowfast | An unofficial pytorch implementation of "Early Anticipation of Driving... | 13 | Experimental |
| 77 | fabiosilva781/top-cvpr-2025-papers | 🌟 Discover top CVPR 2025 papers for insightful research in computer vision,... | 13 | Experimental |
| 78 | Ricardosc97/T-PIE | Pedestrian Intention Estimation using stacked Transformers Encoders | 12 | Experimental |
| 79 | bihani-g/LASeR | Code and Analysis for our paper titled 'Low Anisotropy Sense Retrofitting... | 11 | Experimental |
| 80 | tayo4christ/transformer-gesture | Real-time gesture recognition system using Vision Transformers, ONNX, and... | 11 | Experimental |
| 81 | aditi184/Person_Re-Identification | Person ReIdentification using Locally Aware Transformers | 11 | Experimental |
| 82 | tthinking/EAT | [IEEE TMM 2025] Official implementation of EAT: Multi-Exposure Image Fusion... | 11 | Experimental |
| 83 | Geetanshu0410/Gesture-Bridge | Sign Language Translator GestureBridge is a cutting-edge AI-driven system... | 10 | Experimental |
| 84 | harshavardhan-patil/where-am-i | Transformer backed geo-localizer to find an address in the USA based on... | 10 | Experimental |
| 85 | retkowsky/synthetic_images | Synthetic images with Transformers | 10 | Experimental |
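
The tier labels in the listing appear to follow score cut-offs. The sketch below is an assumption inferred from the rows, not a documented rule: the page only states that scores above 50 are Established, while the boundary between Emerging (lowest listed score 30) and Experimental (highest listed score 29) is read off the data. The function name `tier_for` is hypothetical, and no listed score falls between 50 and 54, so the behaviour at exactly 50 is a guess.

```python
from collections import Counter


def tier_for(score):
    """Map a 0-100 score to a tier name using inferred cut-offs (assumption)."""
    if score > 50:          # "4 score above 50 (established tier)"
        return "Established"
    if score >= 30:         # lowest Emerging score in the listing is 30
        return "Emerging"
    return "Experimental"   # highest Experimental score in the listing is 29


# A few (score, tier) pairs copied from the listing, used as a consistency check.
samples = [
    (63, "Established"),
    (55, "Established"),
    (49, "Emerging"),
    (30, "Emerging"),
    (29, "Experimental"),
    (10, "Experimental"),
]

# Pairs where the inferred rule disagrees with the listed tier.
mismatches = [(score, tier) for score, tier in samples if tier_for(score) != tier]
print(mismatches)
print(Counter(tier_for(score) for score, _ in samples))
```

If the API response exposes scores, the same function could be used to check the inferred cut-offs against all 85 entries.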

Comparisons in this category