AdemBoukhris457/Documents-Parsing-Lab
Jupyter notebooks testing different OCR models for document parsing (Dolphin, MonkeyOCR, Marker, Nanonets, ...)
This project helps you automate the extraction of key information from documents such as PDFs and scanned images. You provide the documents, and it returns structured text, tables, and even data from charts, ready for analysis or database entry. It is aimed at data entry clerks, compliance officers, researchers, or anyone who needs to digitize and analyze large volumes of documents efficiently.
Use this if you need to automatically pull specific data, tables, or charts from digital or scanned documents and process them systematically.
Not ideal if you only need plain text conversion and do not need to preserve document structure or extract specific data fields.
Stars: 78
Forks: 9
Language: Jupyter Notebook
License: —
Category:
Last pushed: Nov 01, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/AdemBoukhris457/Documents-Parsing-Lab"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
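The same request can be made from Python. The following is a minimal sketch using only the standard library; the base URL and path layout are copied from the curl command above, while the function names (`build_quality_url`, `fetch_quality`), the idea that the three path segments are category, owner, and repository, and the assumption that the response body is JSON are illustrative guesses rather than documented behavior.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build an endpoint URL with the same shape as the curl example.

    Hypothetical helper: the segment names are inferred from the example URL.
    """
    # Percent-encode each segment so unusual characters cannot break the path.
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send a GET request and decode the body.

    Assumption: the endpoint returns JSON. The listing does not describe
    the response schema, so no fields are accessed here.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("generative-ai", "AdemBoukhris457", "Documents-Parsing-Lab")` issues one request, which counts against the keyless daily limit noted above, so cache the result if you poll repeatedly.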
Higher-rated alternatives
jupyterlab/jupyter-ai
A generative AI extension for JupyterLab
aws-samples/generative-ai-ml-latam-samples
This repo provides Generative AI and AI/ML code samples, blueprints (end-to-end solutions) and...
dkanungo/Probabilistic-ML-for-finance-and-investing
Probabilistic Machine Learning for Finance and Investing: A Primer to Generative AI with Python
morganstanley/MSML
Repo for Morgan Stanley Machine Learning Research group's publications
Yash-Kavaiya/GenAI-Learning
Up-to-Date Content: We regularly update our repository with new courses, articles, and tutorials...