LeapLabTHU/DAT-Detection
Repository of Vision Transformer with Deformable Attention (CVPR 2022) and DAT++: Spatially Dynamic Vision Transformer with Deformable Attention
This project offers advanced tools for precisely identifying and segmenting objects within images. It takes raw image data and outputs detailed bounding box coordinates for detected objects, along with pixel-level masks for each instance. This is ideal for computer vision researchers and engineers who need to develop and benchmark high-performance object detection and instance segmentation models.
No commits in the last 6 months.
Use this if you are building state-of-the-art computer vision systems that require highly accurate object localization and segmentation, and you need to leverage the latest advancements in Vision Transformers.
Not ideal if you are a beginner looking for a simple, out-of-the-box solution without deep understanding of model training or evaluation, or if your primary need is general image classification.
Stars: 20
Forks: 2
Language: Python
License: —
Category: computer-vision
Last pushed: Apr 17, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/computer-vision/LeapLabTHU/DAT-Detection"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
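If you prefer to call the endpoint from Python instead of curl, a minimal sketch is below. It only assumes what the curl example shows: the URL pattern `/api/v1/quality/<category>/<owner>/<repo>` and that unauthenticated access works within the 100 requests/day limit. The JSON response schema is not documented here, so the fetch helper simply parses whatever JSON comes back; how an API key would be passed is also not documented, so no key handling is shown.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

API_BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL following the curl example's path pattern."""
    return f"{API_BASE}/{quote(category)}/{quote(owner)}/{quote(repo)}"

def fetch_quality(category: str, owner: str, repo: str) -> dict:
    """Fetch repo quality data (unauthenticated: 100 requests/day).

    Assumes the endpoint returns JSON, which the API framing suggests
    but this page does not confirm.
    """
    with urlopen(quality_url(category, owner, repo), timeout=10) as resp:
        return json.load(resp)

if __name__ == "__main__":
    # Matches the curl example for this repository.
    print(quality_url("computer-vision", "LeapLabTHU", "DAT-Detection"))
```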
Higher-rated alternatives
BR-IDL/PaddleViT
:robot: PaddleViT: State-of-the-art Visual Transformer and MLP Models for PaddlePaddle 2.0+
pathak22/unsupervised-video
[CVPR 2017] Unsupervised deep learning using unlabelled videos on the web
IBM/CrossViT
Official implementation of CrossViT. https://arxiv.org/abs/2103.14899
NVlabs/GCVit
[ICML 2023] Official PyTorch implementation of Global Context Vision Transformers
ViTAE-Transformer/ViTDet
Unofficial implementation for [ECCV'22] "Exploring Plain Vision Transformer Backbones for Object...