jacobmarks/pytesseract-ocr-plugin

Run optical character recognition with PyTesseract from the FiftyOne App!

/ 100

Experimental

This plugin extracts text from images of documents and other visual media. It runs PyTesseract OCR on image files and converts the text they contain into searchable data, identifying individual words and text blocks. It suits anyone working with large collections of scanned documents, forms, or other text-bearing images, such as data entry specialists or archivists.
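As a rough illustration of what "identifying individual words and text blocks" can look like, the sketch below groups word-level OCR rows into blocks. It is not the plugin's own code. It assumes input shaped like the dictionary returned by `pytesseract.image_to_data(..., output_type=Output.DICT)`; the helper name `group_words_into_blocks`, the `min_conf` threshold, the file name `scan.png`, and the sample values are all hypothetical.

```python
from collections import defaultdict


def group_words_into_blocks(data, min_conf=0.0):
    """Group word-level OCR rows into text blocks (hypothetical helper).

    `data` is a dict of parallel lists, in the shape produced by
    pytesseract.image_to_data(..., output_type=Output.DICT).
    """
    blocks = defaultdict(list)
    for i, text in enumerate(data["text"]):
        word = text.strip()
        # Skip rows with no text (page/block/line rows) and low-confidence words.
        if not word or float(data["conf"][i]) < min_conf:
            continue
        left, top = data["left"][i], data["top"][i]
        box = (left, top, left + data["width"][i], top + data["height"][i])
        blocks[data["block_num"][i]].append((word, box))

    result = []
    for block_num in sorted(blocks):
        words = blocks[block_num]
        boxes = [b for _, b in words]
        result.append({
            "block": block_num,
            "text": " ".join(w for w, _ in words),
            # Block box is the union of its word boxes: (x1, y1, x2, y2).
            "box": (min(b[0] for b in boxes), min(b[1] for b in boxes),
                    max(b[2] for b in boxes), max(b[3] for b in boxes)),
            "words": words,
        })
    return result


# Made-up sample in the image_to_data dict shape (not real OCR output).
sample = {
    "block_num": [1, 1, 1, 2, 2],
    "left":   [10, 10, 60, 10, 50],
    "top":    [10, 10, 10, 50, 50],
    "width":  [100, 40, 50, 30, 60],
    "height": [20, 20, 20, 20, 20],
    "conf":   [-1, 96, 91, 88, 12],
    "text":   ["", "Invoice", "Total", "Paid", "x7#"],
}

blocks = group_words_into_blocks(sample, min_conf=50)
for block in blocks:
    print(block["block"], block["text"], block["box"])

# With pytesseract, Pillow, and the Tesseract binary installed, real input
# could come from (file name is a placeholder):
#   from PIL import Image
#   import pytesseract
#   data = pytesseract.image_to_data(
#       Image.open("scan.png"), output_type=pytesseract.Output.DICT)
```

Filtering on confidence before grouping keeps noisy detections out of the searchable text; the threshold of 50 here is arbitrary and would need tuning for real scans.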

No commits in the last 6 months.

Use this if you need to quickly get machine-readable text from image-based documents to make them searchable or analyzable.

Not ideal if you need very high accuracy on handwriting or on complex, low-quality document scans.

document-processing data-extraction content-digitization archiving information-retrieval

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 5 / 25

Maturity 8 / 25

Community 0 / 25


Stars

Forks —

Language Python

License —

Higher-rated alternatives

deepdoctection/deepdoctection

A Repo For Document AI

deanmalmgren/textract

extract text from any document. no muss. no fuss.

eikek/docspell

Assist in organizing your piles of documents, resulting from scanners, e-mails and other sources...

zzzDavid/ICDAR-2019-SROIE

ICDAR 2019 Robust Reading Challenge on Scanned Receipts OCR and Information Extraction

clovaai/donut

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic...
