gptscript-ai/gptparse
Document parser for RAG
This tool converts complex PDFs and image files into structured Markdown. It produces well-formatted Markdown text, including tables, lists, and embedded images, which is easy to integrate into text-based applications. It is aimed at anyone building systems that need to process and understand document content from PDFs or images, for example intelligent search or content analysis.
No commits in the last 6 months. Available on PyPI.
Use this if you need to extract structured text from a variety of document types, including images and PDFs with complex layouts, for use in text-based workflows or AI systems.
Not ideal if you only need simple text extraction from basic, text-only documents without any complex formatting or embedded visuals.
Stars: 28
Forks: 2
Language: Python
License: Apache-2.0
Category:
Last pushed: Nov 13, 2024
Commits (30d): 0
Dependencies: 36
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/gptscript-ai/gptparse"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
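The same request can be sketched in Python with only the standard library. This is a minimal sketch under stated assumptions: the path layout `quality/<category>/<owner>/<repo>` is inferred from the single curl URL above, the response is assumed to be JSON, and the helper names `quality_url` and `fetch_quality` are hypothetical, not part of any published client.

```python
import json
import urllib.request

# Base path taken from the curl example; the segment layout below is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build an endpoint URL shaped like the one in the curl example."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """GET the endpoint and decode the body as JSON (assumed response format)."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the URL; call fetch_quality(...) to perform the request.
    print(quality_url("rag", "gptscript-ai", "gptparse"))
```

Calling `fetch_quality("rag", "gptscript-ai", "gptparse")` performs the actual request and counts against the daily limit, so it is worth caching the result rather than polling.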
Higher-rated alternatives
gpt-open/rag-gpt
RAG-GPT, leveraging LLM and RAG technology, learns from user-customized knowledge bases to...
LexiestLeszek/scrapeGPT
ScrapeGPT is a RAG-based Telegram bot designed to scrape and analyze websites, then answer...
leon0204/fast-rag
LLM Rag Intelligent Q&A Robot
maanvithag/thinkai
An LLM app with Retrieval Augmented Generation (RAG) built using OpenAI GPT models, Langchain...
PatentTRIZbasedAI20260226110030/Patent-GPT
Patent-GPT is an Agentic RAG-based invention copilot combining TRIZ methodology with LLMs. It...