OCR Document Extraction Transformer Models

Tools for extracting text and structured data from images, PDFs, and documents using transformer-based OCR models. Does NOT include general document analysis or post-extraction processing such as LLM-based summarization and Q&A.

There are 49 OCR document extraction models tracked. 4 score 50 or higher (Established tier). The highest-rated is clusterzx/paperless-ai at 57/100 with 5,410 stars. 1 of the top 10 is actively maintained.

Get all 49 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=ocr-document-extraction&limit=20"

Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000 requests/day.
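The same request can be made from Python using only the standard library. This is a minimal sketch, not an official client: the endpoint and the three query parameters are copied from the curl command above, while the function names `build_url` and `fetch_quality` are hypothetical. The shape of the JSON response is not documented here, so the sketch returns the decoded payload as-is. Note that the curl example asks for `limit=20`; whether the API accepts a larger limit that covers all 49 projects is an assumption to check against the live service.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example; everything else is illustrative.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="transformers",
              subcategory="ocr-document-extraction",
              limit=20):
    """Return the dataset URL with the query parameters used by the curl example."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_quality(limit=20, timeout=30):
    """Fetch and decode the JSON payload (hypothetical helper; schema not assumed)."""
    with urlopen(build_url(limit=limit), timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_quality()` performs one keyless request, which counts against the 100 requests/day allowance.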

#	Model	Score	Tier	Stars	Language
1	clusterzx/paperless-ai An automated document analyzer for Paperless-ngx using OpenAI API, Ollama,...	57	Established	5,410	JavaScript
2	kha-white/manga-ocr Optical character recognition for Japanese text, with the main focus being...	54	Established	2,582	Python
3	alephpi/Texo-web The web application for Texo, a minimalist SOTA LaTeX OCR model which...	51	Established	46	Vue
4	bytefer/ollama-ocr Implementing OCR with a local visual model run by ollama.	50	Established	300	TypeScript
5	alephpi/Texo A minimalist SOTA LaTeX OCR model with only 20M parameters, running in...	49	Emerging	747	Python
6	Dartvauder/NeuroSandboxWebUI (Windows/Linux/MacOS) Local WebUI with neural network models (Text, Image,...	42	Emerging	108	Python
7	FreeOCR-AI/layoutreader A Faster LayoutReader Model based on LayoutLMv3, Sort OCR bboxes to reading order.	42	Emerging	314	Python
8	samestrin/llm-pdf-ocr-api A Python-based REST API for PDF OCR using AI models with PyTorch and...	40	Emerging	34	Python
9	JonSnow1807/Medical-Prescription-OCR OCR system for handwritten medical prescriptions using Donut transformer and...	37	Emerging	9	Jupyter Notebook
10	neosantara-xyz/glm-ocr-inference Fast and lightweight GLM-OCR inference on Modal with an OpenAI-compatible...	36	Emerging	3	Python
11	CYFARE/PDXTRACT Extract From PDF's Using Ollama Local LLM	32	Emerging	4	Python
12	sitammeur/gliner-litserve Leverage ModernGLiNER's capabilities using LitServe.	30	Emerging	2	Python
13	lucky-verma/SaastIE Document understanding system using Donut transformer architecture	30	Emerging	7	Python
14	Dartvauder/NeuroTrainerWebUI (Windows/Linux) Local WebUI for finetuning, evaluation and generation of...	29	Experimental	9	Python
15	Quotify-Bot/quotify-frontend AI-powered inspirational quote generator	27	Experimental	7	JavaScript
16	muhammad-fiaz/EMSUGI EMSUGI is a future prediction & analysis project on various factor like...	27	Experimental	1	HTML
17	inuwamobarak/nougat Nougat is a Meta AI's revolutionary OCR model designed to transcribe...	25	Experimental	27	Jupyter Notebook
18	Kovelja009/handwriting-recognition Benchmark of different network architectures for handwritten text recognition.	25	Experimental	5	Jupyter Notebook
19	ToluClassics/LowResourceOCR This work is an adaptation of CNN+Transformer architecture to training text...	24	Experimental	5	Python
20	arora-r/gradio-example This repository is an example of dockerizing a Gradio application which uses...	21	Experimental	1	Python
21	KadirCanCelik/Handwriting-to-digital Handwriting to text conversion using line segmentation and OCR techniques	21	Experimental	—	Jupyter Notebook
22	bcastelino/ocr-text-vision-pro AI-powered OCR application using Free OpenRouter Vision Models for advanced...	21	Experimental	—	Python
23	PRITHIVSAKTHIUR/dots.ocr-fix-demo This Gradio application demonstrates the capabilities of the "dots.ocr"...	21	Experimental	2	Jupyter Notebook
24	Metedout-biographer66/dots.ocr-fix-demo 🖼️ Upload images to experience accurate multilingual OCR results with the...	21	Experimental	—	Jupyter Notebook
25	koesan/Manga_Comic_Colorization_and_Translation_v1 AI-powered manga and comic translator using EasyOCR and Hugging Face...	21	Experimental	3	Python
26	resetpaid/lumina Perform passive domain reconnaissance using public data sources without...	21	Experimental	—	Python
27	SemanticWave-Hoyeon/NavtexRecovery AI-powered restoration system for damaged NAVTEX (NAVigational TEleX)...	21	Experimental	—	Vue
28	AleNard89/py-pytorch-invoice Automated invoice data extraction using LayoutLMv3 (PyTorch) with PyQt6...	21	Experimental	—	Python
29	sorcero/ingestum Read-only mirror of https://gitlab.com/sorcero/community/ingestum	20	Experimental	7	Python
30	Mustapha-AJEGHRIR/arabic_calligraphy This is a repo containing our code for Arabic calligraphy style detection...	20	Experimental	8	Jupyter Notebook
31	Mohammed20201991/OCR_HU_Tra2022 HTR Transformer for Hungarian Language	19	Experimental	1	Jupyter Notebook
32	SD7Campeon/Gemma3_OCR_Text_Extractor_LLM Gemma-3 OCR exemplifies the confluence of abstruse computer vision and...	18	Experimental	1	Python
33	sitammeur/readerlm-litserve Leverage Reader-LM's capabilities using LitServe.	13	Experimental	—	Python
34	stelaras36/OCRfixer Web & CLI tool to fix noisy OCR text using a fine-tuned T5 model	13	Experimental	—	Python
35	Parth844/AI_pdf_to_Epub AI-powered PDF to EPUB conversion engine with LLM-based chapter detection...	13	Experimental	—	Python
36	Rayyan9477/OCR-Image-to-text Developed an OCR Image-to-Text application using Python and Streamlit,...	13	Experimental	4	Python
37	Eduardo-PRg/NLM2Img 🖼️ Combine multi-page PDFs into a seamless image and add custom stamps, all...	13	Experimental	—	TypeScript
38	connerohnesorge/modal-deepseek-ocr modal.com deployment of deepseek ocr as a fastapi serverless app	12	Experimental	1	Python
39	krasimirkostadinov/AI-kyc-ocr-id-validator An offline AI-powered KYC document processing system that extracts...	12	Experimental	2	TypeScript
40	sitammeur/paligemma2-docci-litserve Leverage PaliGemma 2's DOCCI fine-tuned variant capabilities using LitServe.	11	Experimental	—	Python
41	sitammeur/videollama3-litserve Leverage VideoLLaMA 3's capabilities using LitServe.	11	Experimental	—	Python
42	sitammeur/got-ocr-litserve Leverage GOT-OCR2's optical character recognition capabilities using LitServe.	11	Experimental	—	Python
43	sunsun8170/YZU-CAPTCHA-TrOCR A TrOCR-small-printed model fine-tuned on 419,880 CAPTCHAs from the YZU...	11	Experimental	—	Python
44	sitammeur/modernbert-litserve Leverage ModernBERT's capabilities using LitServe.	11	Experimental	—	Python
45	sitammeur/siglip2-litserve Leverage SigLIP 2's capabilities using LitServe.	11	Experimental	—	Python
46	sitammeur/gemma3-litserve Leverage Gemma 3's capabilities using LitServe.	11	Experimental	—	Python
47	sitammeur/paligemma2-mix-litserve Leverage PaliGemma 2 mix model variant capabilities using LitServe.	11	Experimental	—	Python
48	massimilianoviola/visual-translator Translate objects in images with a click, get contextual sentences and hear...	11	Experimental	3	Python
49	sitammeur/align-anything-litserve Leverage Align-DS-V's capabilities using LitServe.	11	Experimental	—	Python
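The tier boundaries are not stated on this page, but they can be read off the table: 50 is the lowest Established score, 49 and 30 are both Emerging, and 29 is the highest Experimental score. The sketch below encodes those inferred cutoffs (an assumption, not a published definition) and tallies the 49 scores listed above. The name `tier_for` is hypothetical.

```python
from collections import Counter

# Scores copied from the table, in rank order (49 entries).
SCORES = (
    [57, 54, 51, 50, 49, 42, 42, 40, 37, 36, 32, 30, 30, 29, 27, 27, 25, 25, 24]
    + [21] * 9
    + [20, 20, 19, 18]
    + [13] * 5
    + [12, 12]
    + [11] * 10
)


def tier_for(score):
    """Map a score to a tier using cutoffs inferred from the table rows."""
    if score >= 50:   # assumption: lowest Established row scores 50
        return "Established"
    if score >= 30:   # assumption: lowest Emerging row scores 30
        return "Emerging"
    return "Experimental"


tier_counts = Counter(tier_for(s) for s in SCORES)
```

Under these cutoffs the tally is 4 Established, 9 Emerging, and 36 Experimental projects, which matches the Tier column row by row.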

Comparisons in this category

Texo-web and Texo (51 vs 49)
NeuroSandboxWebUI and NeuroTrainerWebUI (42 vs 29)