gptscript-ai/gptparse

Document parser for RAG

Score: 39 / 100 (Emerging)

This tool helps you convert complex PDFs and image files into structured Markdown. It takes your documents or images and produces well-formatted Markdown text, including tables, lists, and embedded images, making it easy to integrate into text-based applications. Anyone building systems that need to process and understand document content from PDFs or images, such as for intelligent search or content analysis, would find this useful.

No commits in the last 6 months. Available on PyPI.

Use this if you need to extract structured text from a variety of document types, including images and PDFs with complex layouts, for use in text-based workflows or AI systems.

Not ideal if you only need simple text extraction from basic, text-only documents without any complex formatting or embedded visuals.

document-processing content-extraction information-retrieval AI-data-preparation knowledge-management
Stale 6m
Maintenance 0 / 25
Adoption 7 / 25
Maturity 25 / 25
Community 7 / 25
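
The four sub-scores above add up to the overall figure of 39 / 100. The page does not state the scoring formula, so the snippet below is only a sanity check under the assumption that the total is a plain sum of the four categories, each capped at 25.

```python
# Sub-scores as listed on this page (each out of 25).
subscores = {"Maintenance": 0, "Adoption": 7, "Maturity": 25, "Community": 7}

# Assumption: the overall score is the unweighted sum of the sub-scores.
total = sum(subscores.values())
maximum = 25 * len(subscores)

print(f"{total} / {maximum}")
```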


Stars: 28
Forks: 2
Language: Python
License: Apache-2.0
Last pushed: Nov 13, 2024
Commits (30d): 0
Dependencies: 36

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/rag/gptscript-ai/gptparse"

Open to everyone: 100 requests/day with no key needed, or 1,000 requests/day with a free key.
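
For use from a script rather than the shell, the same request can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the helper names `quality_url` and `fetch_quality` are hypothetical, the `owner/repo` URL pattern is generalized from the single curl example above, and the response is assumed to be JSON (its schema is not documented on this page).

```python
import json
import urllib.error
import urllib.request

# Base path taken from the curl example; reuse for other repos is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/rag"


def quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository, mirroring the curl command."""
    return f"{BASE_URL}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Return the decoded response body, or None if the request fails.

    Assumption: the endpoint answers with JSON. No API key is sent, so the
    keyless daily quota applies.
    """
    request = urllib.request.Request(
        quality_url(owner, repo),
        headers={"Accept": "application/json"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        # Non-2xx status, for example when the daily quota is exhausted.
        print(f"Request failed with HTTP {err.code}")
        return None
    except urllib.error.URLError as err:
        # DNS failure, refused connection, timeout, and similar.
        print(f"Could not reach the API: {err.reason}")
        return None
```

Calling `fetch_quality("gptscript-ai", "gptparse")` issues the same GET request as the curl command. Returning `None` on failure keeps a batch job running when the quota runs out partway through.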