butlerlabs/docai
DocAI helps developers quickly build document, image, and text processing pipelines using open-source and cloud-based machine learning models for a wide range of applications.
This is a Python library that helps developers build automated workflows for processing documents, images, and text. You feed it files like JPEGs, PNGs, or PDFs, and it uses machine learning to extract specific information. It's designed for developers who need to integrate advanced document intelligence into their applications.
No commits in the last 6 months.
Use this if you are a developer building applications that need to automatically process and extract data from various document types.
Not ideal if you are an end-user looking for a ready-to-use application, as this is a developer tool requiring coding.
Stars: 21
Forks: 1
Language: Python
License: Apache-2.0
Category:
Last pushed: Dec 09, 2022
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/butlerlabs/docai"
Open to everyone: 100 requests/day with no key needed, or 1,000 requests/day with a free key.
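The same request can be made from Python. The following is a minimal sketch using only the standard library; the helper names `build_quality_url` and `fetch_quality` are hypothetical, the URL layout is inferred from the single curl example above, and the response is assumed (not confirmed by this page) to be JSON.

```python
import json
import urllib.request

# Base path taken from the curl example; everything after it is
# "<category>/<owner>/<repo>" as it appears in that one URL.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, repo: str) -> str:
    """Join the base URL, a category slug, and an "owner/name" repo path."""
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Send a keyless GET request and decode the body.

    Assumption: the endpoint returns JSON. If it does not,
    json.loads raises a ValueError.
    """
    url = build_quality_url(category, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("ml-frameworks", "butlerlabs/docai")` would issue the same request as the curl command and counts against the daily request limit.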
Higher-rated alternatives
paperless-ngx/paperless-ngx
A community-supported supercharged document management system: scan, index and archive all your documents
GoogleCloudPlatform/document-ai-samples
Sample applications and demos for Document AI, the end-to-end document processing platform on...
aws-solutions/document-understanding-solution
Example of integrating & using Amazon Textract, Amazon Comprehend, Amazon Comprehend Medical,...
naiveHobo/InvoiceNet
Deep neural network to extract intelligent information from invoice documents.
aphp/edspdf
EDS-PDF is a generic, pure-Python framework for text extraction from PDF documents. It provides...