Ingestion RAG Tools
There are 50 ingestion tools tracked. The highest-rated is veyliss/ai-localbase at 48/100 with 133 stars.
Get the projects as JSON (the example request below passes limit=20):
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=rag&subcategory=ingestion&limit=20"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
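The same request can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper names `build_url` and `fetch_projects` are hypothetical, the shape of the JSON response is not shown in this listing, and whether the endpoint accepts a larger `limit` (for example 50) is an assumption you would need to verify.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Endpoint taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="rag", subcategory="ingestion", limit=20):
    """Build the dataset URL with the same query parameters as the curl example."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """GET the URL and decode the body as JSON.

    The payload structure is not documented in this listing, so the
    decoded value is returned unchanged for the caller to inspect.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_projects(build_url())` issues one keyless request, which counts against the 100 requests/day allowance, so it is worth caching the result locally rather than refetching on every run.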
| # | Tool | Score | Tier |
|---|---|---|---|
| 1 | veyliss/ai-localbase: A local-first AI knowledge base system (RAG) that connects local documents to assisted search and LLM chat workflows. Currently supports md, txt, and pdf (text) files. | | Emerging |
| 2 | Cloud2BR-MSFTLearningHub/RAG-ChatBot-Implementation: This repository contains an example of a RAG chat bot with a basic architecture... | | Emerging |
| 3 | LakshmiSravyaVedantham/RAG-Based-Chatbot-with-Streamlit: Chat with any document (PDF, CSV, DOCX) using RAG — LangChain + Streamlit + OpenAI | | Emerging |
| 4 | yifanfeng97/Hyper-Extract: Transform unstructured text into structured knowledge with LLMs. Graphs,... | | Emerging |
| 5 | Tendo33/markio: A powerful document processing service that seamlessly converts a wide range... | | Experimental |
| 6 | LLMSystems/file2md: file2md is a versatile tool for converting multiple file formats to Markdown. | | Experimental |
| 7 | brolyroly007/docschat: RAG chat system: ingest documents, embed in ChromaDB, and chat with any LLM | | Experimental |
| 8 | trenknerpeter/mdspin: Document to Markdown converter for AI workflows — try it at https://mdspin.app | | Experimental |
| 9 | tushar10sh/NimbusPDF: Private, self-hosted PDF reader with an offline-capable AI assistant. 🔒 100%... | | Experimental |
| 10 | however-yir/ai-demo: Spring AI demo backend with chat, tool calling, multimodal input, PDF RAG,... | | Experimental |
| 11 | lakshgk/distill: Python library that converts Word, Excel, PowerPoint, PDF, and Google Docs... | | Experimental |
| 12 | harshbhanushali26/hArI: AI-powered PDF & CSV analysis assistant using Groq LLM, ChromaDB, and RAG... | | Experimental |
| 13 | PatienceQi/sge_lightrag: SGE: Structure-Guided Extraction for GraphRAG — faithful graph construction... | | Experimental |
| 14 | mulkatz/mulder: Config-driven Document Intelligence Platform on GCP. PDFs → Knowledge Graph,... | | Experimental |
| 15 | VesperArch/rag-ingestion-benchmark: Benchmark: GopherDoc (Go) vs LangChain (Python) — 340× throughput, 3.3× less... | | Experimental |
| 16 | Nufeen/pdf-rag: Local RAG over a PDF collection | | Experimental |
| 17 | vericontext/parsemux: Document parser orchestrator — auto-routes to the optimal OSS parser. CLI,... | | Experimental |
| 18 | Cdharth-07/AI-Powered-Travel-Language-Companion-App: A multimodal AI travel companion built with Streamlit. Features an LLM... | | Experimental |
| 19 | nithinrajkore/PDF-DataAnalyzer: RAG-based PDF QA app using Google Gemini, LangChain, FAISS, and Streamlit —... | | Experimental |
| 20 | Sanya003/Scribe: I read your PDFs so you don’t have to. 👀 | | Experimental |
| 21 | a2Fsa2k/eigen: MS Edge PDF viewer, but simply superior | | Experimental |
| 22 | CHIRABRATA/vagacore: VagaCore — Context-aware NLP engine for extracting structured, time-aware... | | Experimental |
| 23 | sniperx-19/rag-chatbot: Chat with multiple PDFs locally | | Experimental |
| 24 | shivaacodes/document-rag-service: FastAPI RAG microservice for document ingestion and contextual content... | | Experimental |
| 25 | jmatias2411/RAG: 🧠 Query your PDFs with local AI using LangChain, Ollama, and Streamlit. Upload... | | Experimental |
| 26 | DhruvShah510/ai-meeting-assistant: AI-powered meeting assistant that summarizes transcripts, extracts action... | | Experimental |
| 27 | drewid74/ai_skills: AI skills and workflow templates for Claude Code, Copilot, Gemini, any AI... | | Experimental |
| 28 | ashwyan/Privacy-First-Local-RAG-Pipeline: A local AI tool using Ollama (Llama 3) to analyze PDF documents and generate... | | Experimental |
| 29 | leosantos2003/Sabia-QA-System-on-Scientific-Articles: Question-Answer RAG-based system with Sabiá on scientific articles in PDF format. | | Experimental |
| 30 | sam-k0/ExamGen: Generate exam questions based on slides, notes or other PDFs. Answer and... | | Experimental |
| 31 | reezuleanu/pdf_deconstructor: Decompose a PDF file based on its headers for RAG ingestion. | | Experimental |
| 32 | DevPedroGomes/voice_rag: Voice RAG — Upload PDFs and ask questions with voice-powered answers.... | | Experimental |
| 33 | Ashwathama2024/manual-diagnostic-ai: Offline AI diagnostic assistant for marine/industrial equipment — Upload PDF... | | Experimental |
| 34 | MsheesAI/CortexDocs: A smart PDF summarization tool built with AI to convert large documents into... | | Experimental |
| 35 | am2998/RAG-cli: Local-first RAG CLI that ingests documents, stores embeddings in Qdrant, and... | | Experimental |
| 36 | AnshumanMahanta/Cyra-Analytics: Cyra Analytics is a RAG-based CSV analyzer for automated dataset profiling and... | | Experimental |
| 37 | seantlee88/ai-operations-copilot: AI document assistant that summarizes estimates, extracts costs, timelines,... | | Experimental |
| 38 | wahhabriaz/rag-chat-pro: RAG chatbot for PDF Q&A with switchable AI providers and Streamlit UI | | Experimental |
| 39 | Rehman110-F/docmind: Full-stack RAG application — chat with your PDF documents using Google... | | Experimental |
| 40 | willweimike/RAGAgent: Agentic PDF RAG with LangGraph & Ollama | | Experimental |
| 41 | N3M3515069/rag-knowledge-assistant: A RAG-based Q&A assistant that answers questions from uploaded PDFs using... | | Experimental |
| 42 | Yashwanth-23/Omnisense: Local multimodal RAG AI assistant where you can chat with PDFs, images, and... | | Experimental |
| 43 | Anshuljain-bit/pdf-chatbot: Agentic PDF and document-image chatbot with grounded RAG, citations,... | | Experimental |
| 44 | rohanpatil2905/Personal-AI-Assistant: AI-powered PDF assistant using RAG + Gemini API | | Experimental |
| 45 | mayurk224/mindfolio: Mindfolio is an AI-powered knowledge management system and "Second Brain"... | | Experimental |
| 46 | deBUGger404/navexa-docs: Navexa Docs — documentation site for the Navexa PDF/document processing... | | Experimental |
| 47 | Hema4640/AI-Document-Assistant: AI-powered PDF Question Answering System using RAG, LangChain, and ChromaDB | | Experimental |
| 48 | aseseri/agent-society-user-simulation: LLM-based User Simulation Agent for the AgentSociety Challenge. Features... | | Experimental |
| 49 | MekdelawitGebre/student-notes-rag: RAG application that lets students upload PDF notes and ask questions using... | | Experimental |
| 50 | EtheXReal/basiclaw-rag: RAG demo that turns the Hong Kong Basic Law PDF into a FAISS + Redis... | | Experimental |