Arterning/DeepParseX

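DeepParseX is a powerful multimodal document parsing and knowledge management platform. It supports intelligent parsing of PDF, Word, Excel, PPT, image, video, audio, and other file formats, automatically extracts key information, and builds retrieval-augmented generation (RAG) and knowledge graph systems for intelligent retrieval and reasoning over structured data.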


Established

This helps organizations and individuals manage vast amounts of information by automatically processing various file types like PDFs, Word documents, Excel spreadsheets, images, videos, and audio. It extracts key facts and relationships, organizing them into a searchable knowledge base. Anyone who needs to make sense of large collections of documents and media, such as knowledge managers, researchers, or data analysts, can use this.

Use this if you need to automatically extract, organize, and query information from a diverse set of unstructured and semi-structured documents and media files.

Not ideal if your data is already highly structured in traditional databases and you don't need to process text, images, or audio for information extraction.

knowledge-management document-intelligence information-extraction data-analysis research-support

No Package No Dependents

Maintenance 10 / 25

Adoption 8 / 25

Maturity 16 / 25

Community 17 / 25


Language Python

License MIT

Related tools

thiswillbeyourgithub/wdoc

Summarize and query from a lot of heterogeneous documents. Any LLM provider, any filetype,...

NoEdgeAI/pdfdeal

A python wrapper for the Doc2X API and comes with native texts processing (to improve PDF recall...

laxmimerit/RAGWire

Production-grade RAG toolkit — ingest PDFs, DOCX, XLSX into Qdrant with LLM metadata extraction,...

David-Lolly/ViewRAG

Multimodal PDF RAG system with layout-aware chunking, deep chart and figure understanding, and precise visual source tracing.

atpuxiner/docsloader

This is a document loader. (Document parsing loader, RAG document parsing, RAG knowledge base construction)
