AhmedAl93/multimodal-semantic-RAG
A RAG system designed to process documents with multimodal content. It generates factual, context-aware answers to user queries based on the documents' text, tables, figures, ...
This tool helps you quickly get factual, context-aware answers from your documents, even when they contain charts, tables, and images rather than just plain text. You supply one or more PDF documents, then ask questions and receive concise answers drawn directly from your content. It is useful for anyone who needs to extract specific information from detailed reports, research papers, or technical manuals.
No commits in the last 6 months.
Use this if you need to find specific answers within documents that mix text with visual data like graphs and tables, and you're tired of manually sifting through pages.
Not ideal if your documents are purely text-based and simple, or if you need to process file types other than PDFs.
Stars: 26
Forks: 2
Language: Python
License: MIT
Category:
Last pushed: Dec 13, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/AhmedAl93/multimodal-semantic-RAG"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
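The curl request above can also be issued from Python. The following is a minimal sketch using only the standard library, not an official client. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and the path layout (category, then owner, then repository) is an assumption inferred from the single URL shown above. The response format is not stated on this page, so the sketch returns the body as plain text rather than parsing it.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base URL taken from the curl example; everything else is illustrative.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(owner: str, repo: str, category: str = "rag") -> str:
    """Build the endpoint URL for one repository.

    Assumption: the path is <base>/<category>/<owner>/<repo>, inferred
    from the one URL shown in the curl example.
    """
    # Percent-encode each segment so unusual characters cannot alter the path.
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(owner: str, repo: str, category: str = "rag",
                  timeout: float = 10.0) -> str:
    """Fetch the raw response body as text, using keyless access."""
    url = build_quality_url(owner, repo, category)
    with urlopen(url, timeout=timeout) as response:
        # Fall back to UTF-8 if the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)
```

Calling `fetch_quality("AhmedAl93", "multimodal-semantic-RAG")` would request the same URL as the curl command. Keyless access is limited to 100 requests per day, so it is worth caching the result instead of refetching it.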
Higher-rated alternatives
illuin-tech/colpali
The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol.
AnswerDotAI/byaldi
Use late-interaction multi-modal models such as ColPali in just a few lines of code.
jolibrain/colette
Multimodal RAG to search and interact locally with technical documents of any kind
nannib/nbmultirag
A framework in Italian and English that lets you chat with your own documents using RAG,...
OpenBMB/VisRAG
Parsing-free RAG supported by VLMs