Document Intelligence RAG Embedding Tools
Tools for uploading, searching, and conversationally querying documents (PDFs, files, etc.) using embeddings and semantic search to extract insights and answers. Does NOT include code documentation generation, code search, or cross-document fact-checking systems.
47 document intelligence RAG tools are tracked. The highest-rated is haven-jeon/LegalQA at 45/100 with 97 stars.
Get all 47 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=embeddings&subcategory=document-intelligence-rag&limit=20"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
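The request above can also be built programmatically. The following is a minimal Python sketch that uses only the endpoint and the three query parameters shown in the curl command. The helper names `build_url` and `fetch_projects` are hypothetical, and the shape of the JSON response is not documented here, so the sketch returns the parsed payload without assuming any field names. Note that the curl command passes `limit=20` while the listing has 47 entries; whether a larger `limit` is accepted is an assumption to verify against the API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="embeddings",
              subcategory="document-intelligence-rag",
              limit=20):
    """Build the request URL from the query parameters shown in the curl call."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(limit=20, timeout=30):
    """Fetch and parse the JSON payload (hypothetical helper).

    The response structure is not specified in this page, so the parsed
    object is returned as-is. Assumption: no API key is required, per the
    "no key needed" note above.
    """
    with urlopen(build_url(limit=limit), timeout=timeout) as response:
        return json.load(response)
```

Calling `build_url()` reproduces the URL from the curl command, and `fetch_projects()` performs the request. Keep the stated limit of 100 requests/day in mind when calling it repeatedly.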
| # | Tool | Score | Tier |
|---|---|---|---|
| 1 | haven-jeon/LegalQA: Korean LegalQA using SentenceKoBART | 45/100 | Emerging |
| 2 | maxent-ai/ocrpy: OCR, Archive, Index and Search: Implementation agnostic OCR framework. | | Emerging |
| 3 | ametnes/nesis: Your AI Powered Enterprise Knowledge Partner. Designed to be used at scale... | | Emerging |
| 4 | foxminchan/LawKnowledge: A legal knowledge search and Q&A application based on Vietnam's Legal Code... | | Emerging |
| 5 | intel/document-automation: Document Automation Reference Kit | | Emerging |
| 6 | machinelearningZH/document-research-tool: Perform intelligent research over document collections using hybrid search and LLMs. | | Emerging |
| 7 | utachicodes/PyDocEnhancer: An AI-powered Python plugin to enhance documentation with summaries, code... | | Emerging |
| 8 | Schematise-Lex-Data-Analysis/lex-liberalis: A fork of Semantra for Indian court judgments | | Emerging |
| 9 | ryanlane/document-manager: Local-first document archive assistant for semantic search and RAG using... | | Emerging |
| 10 | joe32140/tei-qdrant-cache: Docker Compose stack for scalable TEI embeddings (multi-GPU) fronted by a... | | Emerging |
| 11 | FellowTraveler/ngest: Python script for ingesting various files into a semantic graph. For text,... | | Emerging |
| 12 | kchanda24/hackathon-backend: Enterprise Content Management MVP with semantic search capabilities. Upload... | | Emerging |
| 13 | Leg0shii/smart-documents: A web application that enables users to upload documents and utilize AI... | | Emerging |
| 14 | mcplusa/elastic-ingest-http: This is an Elasticsearch Ingest Pipeline Processor that calls an HTTP(s)... | | Emerging |
| 15 | josego85/pdf-content-search: 🔍 AI-powered PDF search with OCR support for scanned documents, local AI via... | | Experimental |
| 16 | VedantKothari01/DocInsight: AI-powered document originality and plagiarism risk detection system... | | Experimental |
| 17 | moonlitrevery/DodocLens: Document intelligence with local AI (OCR + semantic search) for PDFs and... | | Experimental |
| 18 | HarshilMaks/InsightDocs: AI Document Intelligence System for deep analysis and semantic querying of... | | Experimental |
| 19 | HemalDholakiya12/PDFChat: A web app that allows users to upload PDFs and interact with them through a... | | Experimental |
| 20 | harshsrivastava05/Document-Analyzer: An AI-powered document analysis platform that transforms uploaded files into... | | Experimental |
| 21 | gracee3/qdrant-bge-stack: Local deployment stack for Qdrant vector search with vLLM-served BAAI... | | Experimental |
| 22 | mry0tt4/DocGenie: AI-powered documentation platform that automatically generates, categorizes,... | | Experimental |
| 23 | xhulianokoci/DocCompareAI: ASP.NET Core API for comparing Word documents with AI: text diff, OpenAI... | | Experimental |
| 24 | danilagoleen/vetka-ingest-engine: Ingestion/indexing core for agent systems: scanning, extraction, dependency... | | Experimental |
| 25 | Tonemon/StaxRead: Self-hosted semantic search over your own documents. Your own self-hosted... | | Experimental |
| 26 | LeonKiptoo/document-intelligence-engine: A document intelligence system that enables semantic question answering over... | | Experimental |
| 27 | ashankgupta/docai: DocAI is a Go-based toolkit that enables intelligent interaction with your... | | Experimental |
| 28 | ventz/pdf-semantic-keyword-analysis: High-performance PDF Semantic keyword analysis tool using AI for intelligent... | | Experimental |
| 29 | JacobPolloreno/OfficeAnswers: Get to the real work by using neural information retrieval for company information. | | Experimental |
| 30 | KaramelBytes/docloom-cli: AI‑augmented document analysis and lightweight retrieval (Go) with... | | Experimental |
| 31 | cosmanBrenden/DocumentMuncher: DocumentMuncher is a locally running document search engine that allows you... | | Experimental |
| 32 | KaavyaGala546/DocuMind-AI: DocuMind-AI is an AI-powered document assistant that allows users to upload... | | Experimental |
| 33 | akbar-ops/sistema-de-analisis-de-documentos-juridicos: 📄 Analyze, classify, and search legal documents with advanced NLP techniques... | | Experimental |
| 34 | David-mwas/vidmindAI: VIDMIND is a system designed to automatically summarize, analyze, and... | | Experimental |
| 35 | Helixo613/docforensics: Cross-document contradiction and agreement detection for PDF collections... | | Experimental |
| 36 | Irshad-11/PDF-INSIGHTS: Smart PDF Analyzer with OCR and Semantic Search | | Experimental |
| 37 | tstephx/book-ingestion-python: Book ingestion pipeline for processing PDF/EPUB into searchable chapters... | | Experimental |
| 38 | mamoon-17/DocuQuery: DocuQuery, a minimal RAG demo: upload PDFs, generate local embeddings,... | | Experimental |
| 39 | bivex/qdrant_streamlit_generator_via_groq: 🔍 QDRANT + STREAMLIT + GROQ = VECTOR SEARCH UI. Explore embeddings.... | | Experimental |
| 40 | devinitive-team/mirage: 🏜️ Mirage: Universal, relevance search over PDF documents at any scale.... | | Experimental |
| 41 | maharishiayurveda/DocQuify: Extract insights from research papers with DocQuify. Upload PDFs and ask... | | Experimental |
| 42 | KishoreMuruganantham/HackRx-6.0-Intelligent-Query-Retrieval: LLM-powered system for intelligent query–retrieval from large documents in... | | Experimental |
| 43 | kstv364/intellidoc: Hackathon project - Intellidoc - ECM MVP with semantic search capabilities.... | | Experimental |
| 44 | dyannadle/AI-Powered-Search-Over-Noation: An AI-powered document search engine that connects to Notion and Google... | | Experimental |
| 45 | hlw-aryan/DocuMate: Unlock the true potential of your document assets with DocuMate's... | | Experimental |
| 46 | Mielone2Good/DocVision-AI: Intelligent PDF Document Understanding System with semantic document search... | | Experimental |
| 47 | naKarthikSurya/Legal-AI-Model: An AI-powered Legal Information Retrieval System for Indian Laws and Court... | | Experimental |