maxent-ai/ocrpy

OCR, Archive, Index and Search: Implementation agnostic OCR framework.

/ 100

Emerging

This tool helps you convert scanned documents or images containing text into searchable and editable text files, regardless of whether they are stored locally or in cloud services like AWS or Google Cloud. It takes various document types as input and provides indexed, searchable text as output. This is ideal for anyone managing large volumes of documents, such as a records manager, legal professional, or data entry specialist.

224 stars. No commits in the last 6 months. Available on PyPI.

Use this if you need a straightforward way to extract text from images and documents using different OCR technologies without having to learn each one's specific interface.
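To illustrate what using several OCR technologies behind one interface can look like, here is a minimal sketch. It is not ocrpy's actual API: `register_backend`, `extract_text`, and the two stub backends are hypothetical names invented for this example, and the stubs only transform bytes so the snippet runs without any OCR engine installed.

```python
from typing import Callable, Dict

# Hypothetical type: any function that turns raw document bytes into text.
OcrBackend = Callable[[bytes], str]

# Hypothetical registry mapping a backend name to its implementation.
_BACKENDS: Dict[str, OcrBackend] = {}


def register_backend(name: str, backend: OcrBackend) -> None:
    """Make a backend available under a short name."""
    _BACKENDS[name] = backend


def extract_text(document: bytes, backend: str) -> str:
    """Run the named backend; callers never touch engine-specific code."""
    try:
        engine = _BACKENDS[backend]
    except KeyError:
        raise ValueError(f"unknown OCR backend: {backend!r}") from None
    return engine(document)


# Stand-in backends (assumptions for illustration only). Real ones would
# wrap a local OCR engine or a cloud OCR service.
register_backend("local-stub", lambda data: data.decode("utf-8").upper())
register_backend("cloud-stub", lambda data: data.decode("utf-8").lower())

if __name__ == "__main__":
    print(extract_text(b"Invoice", "local-stub"))
    print(extract_text(b"Invoice", "cloud-stub"))
```

Switching engines in this sketch means changing only the `backend` string, which is the kind of convenience the description above refers to.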

Not ideal if you only occasionally process a few documents or prefer to manually type out text rather than automate the process.

document-management data-entry information-retrieval digital-archiving text-extraction

Maintenance 0 / 25

Adoption 10 / 25

Maturity 25 / 25

Community 9 / 25

Stars 224

Forks

Language Jupyter Notebook

License MIT

Higher-rated alternatives

haven-jeon/LegalQA

Korean LegalQA using SentenceKoBART

ametnes/nesis

Your AI Powered Enterprise Knowledge Partner. Designed to be used at scale from ingesting large...

foxminchan/LawKnowledge

A legal knowledge search and Q&A application based on Vietnam's Legal Code and legal document database ⚖️

intel/document-automation

Document Automation Reference Kit

machinelearningZH/document-research-tool

Perform intelligent research over document collections using hybrid search and LLMs.
