junhoyeo/BetterOCR

🔍 Better text detection by combining multiple OCR engines (EasyOCR, Tesseract, and Pororo) with 🧠 LLM.

Emerging

Struggling with inaccurate text extraction from images or documents, especially in non-English languages? This tool helps you get much more reliable text by combining several OCR (Optical Character Recognition) tools and then using AI to clean up and correct any errors. It takes an image or document as input and provides accurate, readable text, ideal for anyone who needs to digitize information from physical or scanned sources.
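
The combine-then-correct idea described above can be sketched roughly as follows. This is a minimal illustration, not BetterOCR's actual API: the function names `collect_candidates` and `build_correction_prompt`, the prompt wording, and the stub engines are all hypothetical, and the real OCR calls and the LLM request are left out so the snippet runs on its own.

```python
from typing import Callable, Dict


def collect_candidates(
    image_path: str, engines: Dict[str, Callable[[str], str]]
) -> Dict[str, str]:
    """Run every engine on the same image and keep each raw transcription."""
    candidates = {}
    for name, run in engines.items():
        try:
            candidates[name] = run(image_path).strip()
        except Exception:
            # One failing engine should not sink the whole run.
            candidates[name] = ""
    return candidates


def build_correction_prompt(candidates: Dict[str, str], context: str = "") -> str:
    """Assemble a single prompt asking an LLM to reconcile the transcriptions."""
    lines = [
        "Several OCR engines read the same image.",
        "Merge their outputs into one corrected transcription.",
    ]
    if context:
        lines.append(f"Context: {context}")
    for name, text in candidates.items():
        if text:  # skip engines that returned nothing
            lines.append(f"[{name}] {text}")
    return "\n".join(lines)


# Stub engines standing in for real OCR calls (hypothetical outputs).
engines = {
    "easyocr": lambda path: "Hel1o wor1d",
    "tesseract": lambda path: "Hello w0rld ",
}
candidates = collect_candidates("receipt.png", engines)
prompt = build_correction_prompt(candidates, context="English greeting")
print(prompt)
```

In a real pipeline the stubs would wrap actual engine calls and the prompt would be sent to an LLM. That extra round trip is one reason this approach is slower than running a single engine.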

622 stars. No commits in the last 6 months.

Use this if you need to extract text from images, scans, or photos and are frequently frustrated by errors, especially with non-English content or noisy documents.

Not ideal if you require real-time, high-volume text extraction where latency is critical, as combining multiple engines and an LLM can take more time.

document-digitization data-entry-automation content-extraction language-processing image-to-text

Stale 6m No Package No Dependents

Maintenance 2 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 14 / 25

Stars 622

Forks

Language Python

License MIT

Higher-rated alternatives

NanoNets/docstrange

Extract and convert data from any document, image, PDF, Word doc, PPT, or URL into multiple...

th1nhhdk/local_ai_ocr

A local, offline (after initial setup), portable OCR tool that can process images and PDF...

Dicklesworthstone/llm_aided_ocr

Enhances Tesseract OCR output using LLMs (local or API) for error correction, smart chunking,...

emcf/thepipe

Get clean data from tricky documents, powered by vision-language models ⚡

langstruct-ai/langstruct

Extract structured data from any content using LLMs.
