Ingestion RAG Tools

There are 50 ingestion tools tracked. The highest-rated is veyliss/ai-localbase at 48/100 with 133 stars.

Get the projects as JSON (the example below requests 20 per call)

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=rag&subcategory=ingestion&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
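The same request can be made from a script. The sketch below is a minimal illustration using only the Python standard library: it rebuilds the query string from the three parameters visible in the curl example (`domain`, `subcategory`, `limit`) and decodes the response as JSON. The helper names `build_url` and `fetch` are hypothetical, and because the response schema is not described on this page, the decoded payload is returned untouched rather than parsed into fields.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="rag", subcategory="ingestion", limit=20):
    """Build the query URL from the parameters shown in the curl example."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch(url, timeout=10):
    """GET the URL and decode the body as JSON.

    The response layout is an unknown here, so the decoded object
    is returned as-is for the caller to inspect.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    url = build_url()
    print(url)
    # Uncomment to call the live API; each call counts against the daily quota.
    # data = fetch(url)
    # print(json.dumps(data, indent=2)[:500])
```

Keeping URL construction separate from the network call makes it easy to check the request before spending any of the daily quota.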

# Tool Score Tier
1 veyliss/ai-localbase

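A local-first AI knowledge base system (RAG) for connecting local documents to assisted search and LLM chat workflows. Currently supports md, txt, and pdf (text) file types.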

48
Emerging
2 Cloud2BR-MSFTLearningHub/RAG-ChatBot-Implementation

This repository contains an example of a RAG chatbot with a basic architecture...

45
Emerging
3 LakshmiSravyaVedantham/RAG-Based-Chatbot-with-Streamlit

Chat with any document (PDF, CSV, DOCX) using RAG — LangChain + Streamlit + OpenAI

39
Emerging
4 yifanfeng97/Hyper-Extract

Transform unstructured text into structured knowledge with LLMs. Graphs,...

30
Emerging
5 Tendo33/markio

A powerful document processing service that seamlessly converts a wide range...

28
Experimental
6 LLMSystems/file2md

file2md is a versatile tool for converting multiple file formats to Markdown.

28
Experimental
7 brolyroly007/docschat

RAG chat system: ingest documents, embed in ChromaDB, and chat with any LLM

25
Experimental
8 trenknerpeter/mdspin

Document to Markdown converter for AI workflows — try it at https://mdspin.app

24
Experimental
9 tushar10sh/NimbusPDF

Private, self-hosted PDF reader with an offline-capable AI assistant. 🔒 100%...

24
Experimental
10 however-yir/ai-demo

Spring AI demo backend with chat, tool calling, multimodal input, PDF RAG,...

23
Experimental
11 lakshgk/distill

Python library that converts Word, Excel, PowerPoint, PDF, and Google Docs...

23
Experimental
12 harshbhanushali26/hArI

AI-powered PDF & CSV analysis assistant using Groq LLM, ChromaDB, and RAG...

22
Experimental
13 PatienceQi/sge_lightrag

SGE: Structure-Guided Extraction for GraphRAG — faithful graph construction...

22
Experimental
14 mulkatz/mulder

Config-driven Document Intelligence Platform on GCP. PDFs → Knowledge Graph,...

22
Experimental
15 VesperArch/rag-ingestion-benchmark

Benchmark: GopherDoc (Go) vs LangChain (Python) — 340× throughput, 3.3× less...

22
Experimental
16 Nufeen/pdf-rag

Local RAG over a PDF collection

22
Experimental
17 vericontext/parsemux

Document parser orchestrator — auto-routes to the optimal OSS parser. CLI,...

22
Experimental
18 Cdharth-07/AI-Powered-Travel-Language-Companion-App

A multimodal AI travel companion built with Streamlit. Features an LLM...

22
Experimental
19 nithinrajkore/PDF-DataAnalyzer

RAG-based PDF QA app using Google Gemini, LangChain, FAISS, and Streamlit —...

22
Experimental
20 Sanya003/Scribe

I read your PDFs so you don’t have to. 👀

21
Experimental
21 a2Fsa2k/eigen

MS Edge PDF viewer, but simply superior

19
Experimental
22 CHIRABRATA/vagacore

VagaCore — Context-aware NLP engine for extracting structured, time-aware...

19
Experimental
23 sniperx-19/rag-chatbot

Chat with multiple PDFs locally

19
Experimental
24 shivaacodes/document-rag-service

FastAPI RAG microservice for document ingestion and contextual content...

19
Experimental
25 jmatias2411/RAG

🧠 Query your PDFs with local AI using LangChain, Ollama, and Streamlit. Upload...

19
Experimental
26 DhruvShah510/ai-meeting-assistant

AI-powered meeting assistant that summarizes transcripts, extracts action...

19
Experimental
27 drewid74/ai_skills

AI skills and workflow templates for Claude Code, Copilot, Gemini, any AI...

18
Experimental
28 ashwyan/Privacy-First-Local-RAG-Pipeline

A local AI tool using Ollama (Llama 3) to analyze PDF documents and generate...

17
Experimental
29 leosantos2003/Sabia-QA-System-on-Scientific-Articles

Question-Answer RAG-based system with Sabiá on scientific articles in PDF format.

17
Experimental
30 sam-k0/ExamGen

Generate exam questions based on slides, notes or other PDFs. Answer and...

17
Experimental
31 reezuleanu/pdf_deconstructor

Decompose a PDF file based on its headers for RAG ingestion.

17
Experimental
32 DevPedroGomes/voice_rag

Voice RAG — Upload PDFs and ask questions with voice-powered answers....

16
Experimental
33 Ashwathama2024/manual-diagnostic-ai

Offline AI diagnostic assistant for marine/industrial equipment — Upload PDF...

16
Experimental
34 MsheesAI/CortexDocs

A smart PDF summarization tool built with AI to convert large documents into...

16
Experimental
35 am2998/RAG-cli

Local-first RAG CLI that ingests documents, stores embeddings in Qdrant, and...

15
Experimental
36 AnshumanMahanta/Cyra-Analytics

Cyra Analytics is a RAG-based CSV analyzer for automated dataset profiling and...

15
Experimental
37 seantlee88/ai-operations-copilot

AI document assistant that summarizes estimates, extracts costs, timelines,...

14
Experimental
38 wahhabriaz/rag-chat-pro

RAG chatbot for PDF Q&A with switchable AI providers and Streamlit UI

14
Experimental
39 Rehman110-F/docmind

Full-stack RAG application — chat with your PDF documents using Google...

14
Experimental
40 willweimike/RAGAgent

Agentic PDF RAG with LangGraph & Ollama

14
Experimental
41 N3M3515069/rag-knowledge-assistant

A RAG-based Q&A assistant that answers questions from uploaded PDFs using...

14
Experimental
42 Yashwanth-23/Omnisense

Local multimodal RAG AI assistant where you can chat with PDFs, images, and...

14
Experimental
43 Anshuljain-bit/pdf-chatbot

Agentic PDF and document-image chatbot with grounded RAG, citations,...

14
Experimental
44 rohanpatil2905/Personal-AI-Assistant

AI-Powered PDF assistant using RAG + Gemini API

14
Experimental
45 mayurk224/mindfolio

Mindfolio is an AI-powered knowledge management system and "Second Brain"...

14
Experimental
46 deBUGger404/navexa-docs

Navexa Docs — documentation site for the Navexa PDF/document processing...

13
Experimental
47 Hema4640/AI-Document-Assistant

AI-powered PDF Question Answering System using RAG, LangChain, and ChromaDB

13
Experimental
48 aseseri/agent-society-user-simulation

LLM-based User Simulation Agent for the AgentSociety Challenge. Features...

13
Experimental
49 MekdelawitGebre/student-notes-rag

RAG application that lets students upload PDF notes and ask questions using...

13
Experimental
50 EtheXReal/basiclaw-rag

RAG demo that turns the Hong Kong Basic Law PDF into a FAISS + Redis...

13
Experimental