parsee-ai/parsee-core

Retrieval of fully structured data made easy. Use LLMs or custom models. Specialized in PDFs and HTML files. Extensive support for tabular data extraction and multimodal queries.

35 / 100 Emerging

This project helps financial analysts, data entry specialists, and operations managers automatically extract specific information from unstructured documents like PDFs, HTML files, and images. You input these documents, define what data points you need (like an invoice total and its currency), and it outputs that data in a structured, usable format. It's especially useful for handling financial documents.
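The workflow described above (declare the data points you want, then get a structured record back) can be illustrated with a small library-free sketch. Everything here is hypothetical: `FieldSpec`, `extract`, and the regular expressions are illustrative stand-ins and are not parsee-core's API. The project itself uses LLMs or custom models rather than regexes.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class FieldSpec:
    """Hypothetical description of one data point to extract."""
    name: str
    pattern: str  # regex with one capture group; a stand-in for a model query


def extract(text: str, specs: list[FieldSpec]) -> dict[str, Optional[str]]:
    """Return one structured record; fields that are not found map to None."""
    result: dict[str, Optional[str]] = {}
    for spec in specs:
        match = re.search(spec.pattern, text)
        result[spec.name] = match.group(1) if match else None
    return result


# The two data points from the example in the text: invoice total and currency.
INVOICE_SPECS = [
    FieldSpec("invoice_total", r"Total:\s*([\d.,]+)"),
    FieldSpec("currency", r"Total:\s*[\d.,]+\s*([A-Z]{3})"),
]

sample = "Invoice 001\nTotal: 1,250.00 EUR\n"
record = extract(sample, INVOICE_SPECS)
print(record)
```

The point of the sketch is the shape of the exchange: the schema is declared up front, and the output is a record keyed by field name, so downstream code does not depend on how each value was located.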

Use this if you regularly need to pull specific pieces of information, especially from tables within financial PDFs or HTML files, and want to automate this process to get structured data.

Not ideal if your primary goal is general text summarization, or if your data sources are exclusively plain text with no tables or other complex structure.

financial-data-extraction document-processing invoice-automation data-structuring regulatory-compliance
No Package, No Dependents
Maintenance 6 / 25
Adoption 9 / 25
Maturity 16 / 25
Community 4 / 25


Stars 83
Forks 2
Language Python
License MIT
Last pushed Jan 07, 2026
Commits (30d) 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/parsee-ai/parsee-core"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
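The same request can be made from Python using only the standard library. This is a minimal sketch under stated assumptions: the path layout (category, owner, repository) is inferred from the single URL shown above, the response is assumed to be JSON, and the helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path taken from the curl example; the segments after it are assumed
# to be <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the quality data (assumes a JSON response body)."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_quality("nlp", "parsee-ai", "parsee-core")` performs the network request. With the keyless limit of 100 requests/day, it is worth caching the result rather than fetching on every run.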