Document Data Extraction Generative AI Tools
Tools for extracting, parsing, and structuring data from documents (PDFs, images, business cards, invoices, tenders) using OCR and AI. Includes document intelligence, tabular data extraction, and field recognition. Does NOT include document summarization, general document Q&A without structured extraction, or legal/thematic document analysis.
This category tracks 42 document data extraction tools. The highest-rated is gmp007/PropertyExtractor, which scores 38/100 and has 13 stars.
Get all 42 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=generative-ai&subcategory=document-data-extraction&limit=20"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
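The curl request above can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_url` and `fetch_projects` are hypothetical, and the shape of the JSON response is not documented on this page, so the code returns the decoded body without assuming any field names. The `limit=20` default is copied from the curl command; whether a larger value returns all 42 projects in one call is not stated here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken verbatim from the curl command on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="generative-ai",
              subcategory="document-data-extraction",
              limit=20):
    """Build the dataset URL with the same query parameters as the curl command."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """Request the URL and decode the body as JSON.

    The response structure is an unknown here, so the decoded object
    is returned as-is for the caller to inspect.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_projects(build_url())` performs the network request and counts against the daily request limit described above, so it is worth caching the result locally when experimenting.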
| # | Tool | Score | Tier |
|---|---|---|---|
| 1 | gmp007/PropertyExtractor<br>Generative AI-based Software for Material Property and Database Generation | | Emerging |
| 2 | john-ng-hk/Biz-card-scanner<br>A digital repository for your physical business cards | | Emerging |
| 3 | AdritPal08/universal-web-scraper-using-generative-ai<br>Effortless Data Extraction, Powered by Generative AI | | Emerging |
| 4 | jWinman91/AI-OCR<br>An AI-powered but model-agnostic OCR (Optical Character Recognition) tool | | Emerging |
| 5 | ryanmcdonough/lexplore<br>Tool to allow extraction of data from legal documents | | Emerging |
| 6 | jWinman91/AI-OCR-Frontend<br>An AI-powered but model-agnostic OCR (Optical Character Recognition) tool (frontend) | | Emerging |
| 7 | thehackersplaybook/thp-ocr<br>THP-OCR: A simple Gen AI-powered OCR tool. 🍁 | | Emerging |
| 8 | 100ravSingh/ChequeScan<br>My Gen AI deployment | | Experimental |
| 9 | kaifcoder/Invoice-Query-Tool-using-gemini-ai<br>This repository contains a Python project that leverages the Gemini Pro... | | Experimental |
| 10 | codedbyasim/Generative-AI-Document-Intelligence-System<br>Extract and summarise data from PDFs and images using OCR + LLMs. Built with... | | Experimental |
| 11 | viochris/Streamlit-SpendSense<br>💸 SpendSense: An AI-powered personal finance tracker built with Streamlit.... | | Experimental |
| 12 | bejranonda/MeterVision<br>👁️ MeterVision: Enterprise-grade meter infrastructure management with a... | | Experimental |
| 13 | law4percent/CheckMe<br>CheckMe eliminates manual paper checking by using a flatbed scanner,... | | Experimental |
| 14 | Akhand-Pratap-Tiwari/Cyber-Alertz-web-scrapping-microservice<br>Flask app for scraping cybersecurity websites and purifying the raw content... | | Experimental |
| 15 | artyuan/smart-receipt-assistant<br>Reads market invoices to extract and analyze spending data. Tracks prices of... | | Experimental |
| 16 | kmaurinjones/Housing-Law-Insight<br>Web application designed to showcase the potential of Data Science and... | | Experimental |
| 17 | MasterChief-ai/AI-Dataset-Analysis-Tool<br>An AI-powered dataset analysis tool that automatically classifies tasks... | | Experimental |
| 18 | dvp-git/gemini-information-extractor<br>A simple single interface information extractor app using the latest... | | Experimental |
| 19 | jagratadeb/GenAI-UiPath-TextExtractor<br>UiPath automation using OCR and GenAI to extract key data from scanned... | | Experimental |
| 20 | codeterrayt/Scalable-Genai-Invoice-PDF-Data-Extractor<br>Scalable GenAI-powered system to extract structured invoice data from PDFs &... | | Experimental |
| 21 | Wilson0406/Self-Improving-LLM-Agent<br>A dual-agent, feedback-driven document extraction system using GPT-5 and... | | Experimental |
| 22 | Naresh1401/Intelligent-document-processing<br>LLM-powered document processing: extract structured data from invoices,... | | Experimental |
| 23 | Anthtrax/AIcheck<br>📸 Streamline your study process with AIcheck, a quick job-checking tool that... | | Experimental |
| 24 | Magenta91/test101<br>A web application that extracts text from PDF files, processes it using... | | Experimental |
| 25 | Suriya-Prakashar/AI-driven-tender-scrutiny-system-for-NLCI<br>AI-powered system for NLC India Limited to automate tender scrutiny. Uses... | | Experimental |
| 26 | Chaitanyakrishna294/Myntra_Genai<br>Myntra review analysis using GenAI | | Experimental |
| 27 | Phoenixcoder-6/po-automation<br>This project automates the extraction, parsing, and structuring of purchase... | | Experimental |
| 28 | 0ameyasr/DocVal-Mini<br>Insurance Document Validation with Gemini AI + FastAPI | | Experimental |
| 29 | Debjyoti2004/PhotoCheck-AI<br>An intelligent web application that instantly verifies if a passport photo... | | Experimental |
| 30 | het953/AI-Web-Scraper<br>An intelligent web scraping tool built with Streamlit, Selenium, and... | | Experimental |
| 31 | dhcgn/anthropic-paperless-ngx-ocr<br>AnthropicPaperOCR is a CLI tool that extracts text from PDFs using advanced... | | Experimental |
| 32 | vedant-kalal/AI-Visiting-Card-Extractor<br>An AI-powered tool that instantly converts business cards into actionable... | | Experimental |
| 33 | ReNothingg/WBcheker<br>A project for analyzing reviews of products from Wildberries using Gemini... | | Experimental |
| 34 | CyranoB/claim_analysis<br>This project provides a tool to analyze claims made in a webpage or... | | Experimental |
| 35 | RajhansJain/MULTI-LANGUAGE-INVOICE-EXTRACTOR-LLM<br>AI-powered invoice understanding system using Vision + LLMs (Gemini API).... | | Experimental |
| 36 | etrotta/gemini_easy_extractor<br>Automatically extract formatted data out of text documents | | Experimental |
| 37 | Siva-Dev-001/Invoice-Pro-using-LLM<br>A multi-language invoice extractor using Streamlit and LLM | | Experimental |
| 38 | usrtem/ResearchAI<br>AI-powered document analysis tool for querying content across PDFs, Word... | | Experimental |
| 39 | MITTALBHAVYA/InvoiceDetailsExtractor<br>Invoice Extraction Application is a Python-based tool built with Streamlit... | | Experimental |
| 40 | TheOwner-glitch/oracle_hcm_metadata_extractor<br>Python-based command-line tool that extracts Oracle's publicly available HCM... | | Experimental |
| 41 | Rachana-Baldania/multilingual_invoice_extractor_google_gemini<br>multilingual_invoice_extractor_google_gemini-master | | Experimental |
| 42 | AjayMaan13/smart-script-analyzer<br>A Streamlit-based AI tool that uses GPT-4 Vision to extract items, totals,... | | Experimental |