jacobmarks/pytesseract-ocr-plugin
Run optical character recognition with PyTesseract from the FiftyOne App!
This tool extracts text from images of documents and other visual media. It takes image files containing text and converts them into searchable text data, identifying individual words and text blocks. It is aimed at anyone working with large collections of scanned documents, forms, or other images that need text extraction, such as data entry specialists or archivists.
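As a rough illustration of the "individual words and text blocks" step, the sketch below groups word-level OCR rows into blocks. It is not the plugin's own code. It assumes the input is a dictionary shaped like the output of pytesseract's `image_to_data` with `output_type=Output.DICT` (parallel lists under keys such as `text`, `conf`, and `block_num`). The function name `group_words_into_blocks`, the `min_conf` threshold, and the sample data are hypothetical, chosen only for this example.

```python
from collections import defaultdict


def group_words_into_blocks(data, min_conf=0.0):
    """Group word-level OCR rows into text blocks keyed by block number.

    `data` is assumed to hold parallel lists under "text", "conf", and
    "block_num", as in pytesseract's dictionary output.
    """
    blocks = defaultdict(list)
    for text, conf, block_num in zip(data["text"], data["conf"], data["block_num"]):
        word = text.strip()
        # Structural rows (page, block, line) have empty text and a conf of -1,
        # so they are skipped here along with low-confidence words.
        if not word or float(conf) < min_conf:
            continue
        blocks[block_num].append(word)
    return {num: " ".join(words) for num, words in blocks.items()}


# Hypothetical sample data. With Pillow, pytesseract, and the Tesseract binary
# installed, real data could instead come from something like:
#   data = pytesseract.image_to_data(Image.open("scan.png"),
#                                    output_type=pytesseract.Output.DICT)
sample = {
    "text": ["", "Invoice", "No.", "", "Total", "due"],
    "conf": [-1, 96, 91, -1, 88, 12],
    "block_num": [1, 1, 1, 2, 2, 2],
}

blocks = group_words_into_blocks(sample, min_conf=50)
print(blocks)
```

Filtering on confidence is one simple way to keep noisy words out of the searchable text; the right threshold depends on scan quality and would need tuning per collection.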
No commits in the last 6 months.
Use this if you need to quickly get machine-readable text from image-based documents to make them searchable or analyzable.
Not ideal if you need very high accuracy on handwriting or on complex, low-quality document scans.
Stars: 11
Forks: not listed
Language: Python
License: not listed
Category: not listed
Last pushed: Apr 05, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/jacobmarks/pytesseract-ocr-plugin"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
Higher-rated alternatives
deepdoctection/deepdoctection
A Repo For Document AI
deanmalmgren/textract
extract text from any document. no muss. no fuss.
eikek/docspell
Assist in organizing your piles of documents, resulting from scanners, e-mails and other sources...
zzzDavid/ICDAR-2019-SROIE
ICDAR 2019 Robust Reading Challenge on Scanned Receipts OCR and Information Extraction
clovaai/donut
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic...