ViTAE-Transformer/ViTAE-Transformer-Scene-Text-Detection

A comprehensive list [Hi-SAM@TPAMI'24, GoMatching@NeurIPS'24, DeepSolo(++)@CVPR'23, DPText-DETR@AAAI'23, I3CL@IJCV'22] of our research works related to scene text detection, spotting, etc., including papers and code.

/ 100

Experimental

This project offers tools to identify and extract text from images and videos, including difficult cases such as curved or multilingual text and hierarchical structures (strokes, words, lines, paragraphs). It takes an image or video as input and outputs the detected text, often with bounding boxes or segmentation masks. It is aimed at researchers and developers working on computer vision applications that involve optical character recognition (OCR) in real-world environments.

No commits in the last 6 months.

Use this if you need to perform highly accurate scene text detection, spotting, or hierarchical text segmentation from various image and video sources, especially when dealing with challenging text forms.

Not ideal if you're looking for a simple, off-the-shelf OCR solution for document scanning or basic text extraction from clean images, as this focuses on complex scene text research.

scene-text-detection video-text-spotting hierarchical-text-segmentation optical-character-recognition computer-vision-research

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 9 / 25

Maturity 8 / 25

Community 8 / 25


Language: TeX

License: none listed

Higher-rated alternatives

BR-IDL/PaddleViT

PaddleViT: State-of-the-art Visual Transformer and MLP Models for PaddlePaddle 2.0+

pathak22/unsupervised-video

[CVPR 2017] Unsupervised deep learning using unlabelled videos on the web

IBM/CrossViT

Official implementation of CrossViT. https://arxiv.org/abs/2103.14899

NVlabs/GCVit

[ICML 2023] Official PyTorch implementation of Global Context Vision Transformers

ViTAE-Transformer/ViTDet

Unofficial implementation for [ECCV'22] "Exploring Plain Vision Transformer Backbones for Object...
