demidovd98/sm-vit

Official repository for the paper "Salient Mask-Guided Vision Transformer for Fine-Grained Classification" (VISIGRAPP '23)

Emerging

This project classifies images into fine-grained sub-categories, even when the classes look very similar. You provide images (e.g., photos of different bird species or dog breeds), and the model predicts the specific species or breed. This is especially useful for researchers, zoologists, or anyone who needs to tell visually similar items apart.

No commits in the last 6 months.

Use this if you need to accurately differentiate between very similar-looking objects within a broader category, like distinguishing between different types of birds, car models, or dog breeds in images.

Not ideal if you only need general object recognition (e.g., just identifying 'a car' instead of 'a specific car model') or if your images contain objects that are highly dissimilar.

wildlife-identification species-classification product-defect-detection visual-quality-inspection medical-image-analysis

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 6 / 25

Maturity 16 / 25

Community 8 / 25

Language: Python

License: MIT

Higher-rated alternatives

BR-IDL/PaddleViT

:robot: PaddleViT: State-of-the-art Visual Transformer and MLP Models for PaddlePaddle 2.0+

pathak22/unsupervised-video

[CVPR 2017] Unsupervised deep learning using unlabelled videos on the web

IBM/CrossViT

Official implementation of CrossViT. https://arxiv.org/abs/2103.14899

NVlabs/GCVit

[ICML 2023] Official PyTorch implementation of Global Context Vision Transformers

ViTAE-Transformer/ViTDet

Unofficial implementation for [ECCV'22] "Exploring Plain Vision Transformer Backbones for Object...
