Arterning/DeepParseX
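DeepParseX is a powerful multimodal document parsing and knowledge management platform. It supports intelligent parsing of many file formats, including PDF, Word, Excel, PPT, images, video, and audio, automatically extracts key information, and builds retrieval-augmented generation (RAG) and knowledge graph systems for intelligent retrieval and reasoning over structured data.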
DeepParseX helps organizations and individuals manage large volumes of information by automatically processing file types such as PDFs, Word documents, Excel spreadsheets, images, videos, and audio. It extracts key facts and relationships and organizes them into a searchable knowledge base. It suits anyone who needs to make sense of large collections of documents and media, such as knowledge managers, researchers, or data analysts.
Use this if you need to automatically extract, organize, and query information from a diverse set of unstructured and semi-structured documents and media files.
Not ideal if your data is already highly structured in traditional databases and you don't need to process text, images, or audio for information extraction.
Stars: 56
Forks: 11
Language: Python
License: MIT
Category:
Last pushed: Feb 21, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/Arterning/DeepParseX"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
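The same request can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the path layout (category, then owner, then repository) is inferred from the single curl URL above, and the code assumes the endpoint returns JSON, which the listing does not confirm.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; everything after it is assumed
# to follow the pattern <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(owner: str, repo: str, category: str = "rag") -> str:
    """Build the quality endpoint URL for one repository (hypothetical helper)."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(owner: str, repo: str, category: str = "rag", timeout: float = 10.0):
    """Fetch and decode the response, assuming the body is JSON."""
    url = build_quality_url(owner, repo, category)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("Arterning", "DeepParseX")` would request the same URL as the curl command. No API key is sent, so the call falls under the keyless daily limit stated above.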
Related tools
thiswillbeyourgithub/wdoc
Summarize and query from a lot of heterogeneous documents. Any LLM provider, any filetype,...
NoEdgeAI/pdfdeal
A Python wrapper for the Doc2X API that comes with native text processing (to improve PDF recall...
laxmimerit/RAGWire
Production-grade RAG toolkit — ingest PDFs, DOCX, XLSX into Qdrant with LLM metadata extraction,...
David-Lolly/ViewRAG
A PDF RAG system rich in text and images: supports layout-aware chunking, deep chart understanding, and precise visual source tracing. Multimodal PDF RAG: Features layout-aware chunking,...
atpuxiner/docsloader
This is a document loader. (Document parsing loader, RAG document parsing, RAG knowledge base construction)