Document Data Extraction Data Engineering Tools
Four document data extraction tools are tracked. One scores above 70 (Verified tier). The highest-rated is Unstructured-IO/unstructured at 79/100 with 14,211 stars. One of the top 10 is actively maintained.
Get all 4 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=data-engineering&subcategory=document-data-extraction&limit=20"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
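The same request can be made from a script. The following is a minimal Python sketch using only the standard library; the helper names `build_url` and `fetch_projects` are hypothetical, and the shape of the JSON response is not documented here, so the parsed payload is returned unmodified. How a free API key is passed is not stated on this page, so the sketch covers only the keyless request.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain, subcategory, limit=20):
    """Build the query URL for one domain/subcategory pair (hypothetical helper)."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(domain="data-engineering",
                   subcategory="document-data-extraction",
                   limit=20,
                   timeout=10.0):
    """Fetch the listing and return the parsed JSON as-is.

    The response schema is an unknown here, so no fields are assumed.
    """
    url = build_url(domain, subcategory, limit)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_projects()` with its defaults issues the same request as the curl command. Keeping the URL construction in its own function makes it easy to check without touching the network, which helps given the keyless limit of 100 requests per day.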
| # | Tool | Score | Tier |
|---|---|---|---|
| 1 | Unstructured-IO/unstructured: Convert documents to structured data effortlessly. Unstructured is... | 79 | Verified |
| 2 | ThePagePage/docschema: Document schema extraction framework for regulated industries. Parse complex... | | Experimental |
| 3 | amikrsin/StatementSync-Lite: StatementSync is a lightweight, high-performance Progressive Web App (PWA)... | | Experimental |
| 4 | obieg-zero/plugin-wibor-docs: OCR, data extraction from contracts, contract Q&A | | Experimental |