sovit-123/vision_transformers
Vision Transformers for image classification, image segmentation, and object detection.
This project helps computer vision practitioners train models that classify images, detect objects, or segment images into meaningful regions. Given images or video, it produces a trained model for these tasks, or displays the detections and classifications on your input. It is designed for machine learning engineers, data scientists, and researchers working with visual data.
Available on PyPI.
Use this if you need to build and deploy image classification, object detection, or image segmentation models with Vision Transformer and DETR architectures.
Not ideal if you are looking for a no-code solution or a tool for general-purpose data analysis outside of computer vision.
Stars: 65
Forks: 9
Language: Python
License: MIT
Category:
Last pushed: Oct 29, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/sovit-123/vision_transformers"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
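The same request can be made from Python. The sketch below uses only the standard library and the endpoint shown in the curl command. Two things here are assumptions rather than documented behavior: that the path follows a `category/owner/repo` pattern (inferred from the single URL above), and that the response body is JSON (the code falls back to raw text if it is not). The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL.

    Assumption: the path is category/owner/repo, inferred from the
    one example URL on this page.
    """
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the data for one repository.

    Assumption: the body is JSON. If it does not parse, the raw text
    is returned instead so nothing is lost.
    """
    request = urllib.request.Request(
        quality_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = response.read().decode("utf-8")
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        return body


# Example call (performs a network request, counts against the daily limit):
# data = fetch_quality("transformers", "sovit-123", "vision_transformers")
# print(data)
```

Because the unauthenticated limit is 100 requests/day, it may be worth caching the result locally rather than calling `fetch_quality` in a loop.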
Related models
Kohulan/DECIMER-Image_Transformer
DECIMER Image Transformer is a deep-learning-based tool designed for automated recognition of...
fcakyon/video-transformers
Easiest way of fine-tuning HuggingFace video classification models
leaderj1001/BottleneckTransformers
Bottleneck Transformers for Visual Recognition
qubvel/transformers-notebooks
Inference and fine-tuning examples for vision models from 🤗 Transformers
rishikksh20/convolution-vision-transformers
PyTorch Implementation of CvT: Introducing Convolutions to Vision Transformers