OCR Document Extraction Transformer Models

Tools for extracting text and structured data from images, PDFs, and documents using transformer-based OCR models. This category does NOT include general document analysis, LLM-based summarization, or other post-extraction processing such as Q&A.

There are 49 OCR document extraction models tracked. 4 score 50 or higher (Established tier). The highest-rated is clusterzx/paperless-ai at 57/100 with 5,410 stars. Only 1 of the top 10 is actively maintained.

Get all 49 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=ocr-document-extraction&limit=20"

Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
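The same request can be made from a script. The sketch below is a minimal Python equivalent of the curl command, using only the standard library. The helper names `build_url` and `fetch_projects` are hypothetical, and the response schema is not described on this page, so the parsed JSON is returned without assuming any field names. Note that the curl example passes `limit=20`; whether a larger value returns all 49 projects in one call is not stated here, so treat any other limit as an assumption to verify.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint and query parameters copied from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="transformers", subcategory="ocr-document-extraction", limit=20):
    """Return the dataset URL with the same query parameters as the curl call."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """GET the URL and decode the body as JSON.

    The response layout is an unknown here, so the parsed value is
    returned unchanged rather than indexed by guessed field names.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_projects(build_url())` issues one request, which counts against the 100 requests/day keyless allowance mentioned above.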

| # | Model | Description | Score | Tier |
|---|-------|-------------|-------|------|
| 1 | clusterzx/paperless-ai | An automated document analyzer for Paperless-ngx using OpenAI API, Ollama,... | 57 | Established |
| 2 | kha-white/manga-ocr | Optical character recognition for Japanese text, with the main focus being... | 54 | Established |
| 3 | alephpi/Texo-web | The web application for Texo, a minimalist SOTA LaTeX OCR model which... | 51 | Established |
| 4 | bytefer/ollama-ocr | Implementing OCR with a local visual model run by ollama. | 50 | Established |
| 5 | alephpi/Texo | A minimalist SOTA LaTeX OCR model with only 20M parameters, running in... | 49 | Emerging |
| 6 | Dartvauder/NeuroSandboxWebUI | (Windows/Linux/MacOS) Local WebUI with neural network models (Text, Image,... | 42 | Emerging |
| 7 | FreeOCR-AI/layoutreader | A Faster LayoutReader Model based on LayoutLMv3, Sort OCR bboxes to reading order. | 42 | Emerging |
| 8 | samestrin/llm-pdf-ocr-api | A Python-based REST API for PDF OCR using AI models with PyTorch and... | 40 | Emerging |
| 9 | JonSnow1807/Medical-Prescription-OCR | OCR system for handwritten medical prescriptions using Donut transformer and... | 37 | Emerging |
| 10 | neosantara-xyz/glm-ocr-inference | Fast and lightweight GLM-OCR inference on Modal with an OpenAI-compatible... | 36 | Emerging |
| 11 | CYFARE/PDXTRACT | Extract From PDF's Using Ollama Local LLM | 32 | Emerging |
| 12 | sitammeur/gliner-litserve | Leverage ModernGLiNER's capabilities using LitServe. | 30 | Emerging |
| 13 | lucky-verma/SaastIE | Document understanding system using Donut transformer architecture | 30 | Emerging |
| 14 | Dartvauder/NeuroTrainerWebUI | (Windows/Linux) Local WebUI for finetuning, evaluation and generation of... | 29 | Experimental |
| 15 | Quotify-Bot/quotify-frontend | AI-powered inspirational quote generator | 27 | Experimental |
| 16 | muhammad-fiaz/EMSUGI | EMSUGI is a future prediction & analysis project on various factor like... | 27 | Experimental |
| 17 | inuwamobarak/nougat | Nougat is a Meta AI's revolutionary OCR model designed to transcribe... | 25 | Experimental |
| 18 | Kovelja009/handwriting-recognition | Benchmark of different network architectures for handwritten text recognition. | 25 | Experimental |
| 19 | ToluClassics/LowResourceOCR | This work is an adaptation of CNN+Transformer architecture to training text... | 24 | Experimental |
| 20 | arora-r/gradio-example | This repository is an example of dockerizing a Gradio application which uses... | 21 | Experimental |
| 21 | KadirCanCelik/Handwriting-to-digital | Handwriting to text conversion using line segmentation and OCR techniques | 21 | Experimental |
| 22 | bcastelino/ocr-text-vision-pro | AI-powered OCR application using Free OpenRouter Vision Models for advanced... | 21 | Experimental |
| 23 | PRITHIVSAKTHIUR/dots.ocr-fix-demo | This Gradio application demonstrates the capabilities of the "dots.ocr"... | 21 | Experimental |
| 24 | Metedout-biographer66/dots.ocr-fix-demo | 🖼️ Upload images to experience accurate multilingual OCR results with the... | 21 | Experimental |
| 25 | koesan/Manga_Comic_Colorization_and_Translation_v1 | AI-powered manga and comic translator using EasyOCR and Hugging Face... | 21 | Experimental |
| 26 | resetpaid/lumina | Perform passive domain reconnaissance using public data sources without... | 21 | Experimental |
| 27 | SemanticWave-Hoyeon/NavtexRecovery | AI-powered restoration system for damaged NAVTEX (NAVigational TEleX)... | 21 | Experimental |
| 28 | AleNard89/py-pytorch-invoice | Automated invoice data extraction using LayoutLMv3 (PyTorch) with PyQt6... | 21 | Experimental |
| 29 | sorcero/ingestum | Read-only mirror of https://gitlab.com/sorcero/community/ingestum | 20 | Experimental |
| 30 | Mustapha-AJEGHRIR/arabic_calligraphy | This is a repo containing our code for Arabic calligraphy style detection... | 20 | Experimental |
| 31 | Mohammed20201991/OCR_HU_Tra2022 | HTR Transformer for Hungarian Language | 19 | Experimental |
| 32 | SD7Campeon/Gemma3_OCR_Text_Extractor_LLM | Gemma-3 OCR exemplifies the confluence of abstruse computer vision and... | 18 | Experimental |
| 33 | sitammeur/readerlm-litserve | Leverage Reader-LM's capabilities using LitServe. | 13 | Experimental |
| 34 | stelaras36/OCRfixer | Web & CLI tool to fix noisy OCR text using a fine-tuned T5 model | 13 | Experimental |
| 35 | Parth844/AI_pdf_to_Epub | AI-powered PDF to EPUB conversion engine with LLM-based chapter detection... | 13 | Experimental |
| 36 | Rayyan9477/OCR-Image-to-text | Developed an OCR Image-to-Text application using Python and Streamlit,... | 13 | Experimental |
| 37 | Eduardo-PRg/NLM2Img | 🖼️ Combine multi-page PDFs into a seamless image and add custom stamps, all... | 13 | Experimental |
| 38 | connerohnesorge/modal-deepseek-ocr | modal.com deployment of deepseek ocr as a fastapi serverless app | 12 | Experimental |
| 39 | krasimirkostadinov/AI-kyc-ocr-id-validator | An offline AI-powered KYC document processing system that extracts... | 12 | Experimental |
| 40 | sitammeur/paligemma2-docci-litserve | Leverage PaliGemma 2's DOCCI fine-tuned variant capabilities using LitServe. | 11 | Experimental |
| 41 | sitammeur/videollama3-litserve | Leverage VideoLLaMA 3's capabilities using LitServe. | 11 | Experimental |
| 42 | sitammeur/got-ocr-litserve | Leverage GOT-OCR2's optical character recognition capabilities using LitServe. | 11 | Experimental |
| 43 | sunsun8170/YZU-CAPTCHA-TrOCR | A TrOCR-small-printed model fine-tuned on 419,880 CAPTCHAs from the YZU... | 11 | Experimental |
| 44 | sitammeur/modernbert-litserve | Leverage ModernBERT's capabilities using LitServe. | 11 | Experimental |
| 45 | sitammeur/siglip2-litserve | Leverage SigLIP 2's capabilities using LitServe. | 11 | Experimental |
| 46 | sitammeur/gemma3-litserve | Leverage Gemma 3's capabilities using LitServe. | 11 | Experimental |
| 47 | sitammeur/paligemma2-mix-litserve | Leverage PaliGemma 2 mix model variant capabilities using LitServe. | 11 | Experimental |
| 48 | massimilianoviola/visual-translator | Translate objects in images with a click, get contextual sentences and hear... | 11 | Experimental |
| 49 | sitammeur/align-anything-litserve | Leverage Align-DS-V's capabilities using LitServe. | 11 | Experimental |
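
The tier labels in the listing above appear to follow score cut-offs: every project scoring 50 or more is Established, 30 to 49 is Emerging, and 29 or less is Experimental. These thresholds are inferred from the rows on this page and are not a documented rule, so the Python sketch below should be read as an assumption. The names `tier_for_score`, `count_tiers`, and `SAMPLE_ROWS` are hypothetical; the sample rows are the projects sitting on each side of a tier boundary, plus the first and last entries.

```python
from collections import Counter

# Cut-offs inferred from the listing (an assumption, not a documented rule).
ESTABLISHED_MIN = 50
EMERGING_MIN = 30


def tier_for_score(score):
    """Map a 0-100 quality score to the tier label used in the listing."""
    if score >= ESTABLISHED_MIN:
        return "Established"
    if score >= EMERGING_MIN:
        return "Emerging"
    return "Experimental"


# (project, score) pairs copied from the listing: first, last, and boundary rows.
SAMPLE_ROWS = [
    ("clusterzx/paperless-ai", 57),
    ("bytefer/ollama-ocr", 50),
    ("alephpi/Texo", 49),
    ("lucky-verma/SaastIE", 30),
    ("Dartvauder/NeuroTrainerWebUI", 29),
    ("sitammeur/align-anything-litserve", 11),
]


def count_tiers(rows):
    """Count how many (project, score) rows fall into each tier."""
    return Counter(tier_for_score(score) for _, score in rows)
```

For each sample row, `tier_for_score` reproduces the tier printed in the listing, which is the only evidence available here for the inferred cut-offs.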