illuin-tech/colpali

The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol.

/ 100

Established

This project retrieves relevant documents by using both their text and their visual layout. You give it a collection of documents and a search query, and it returns the most relevant documents, taking into account elements such as charts and diagrams. It is designed for anyone who needs to retrieve information from visually rich documents, such as legal professionals, researchers, or data analysts.
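
The related-tools list on this page describes ColPali as a late-interaction model. As a rough illustration of what that means for ranking, here is a minimal sketch of late-interaction (MaxSim) scoring in plain Python. It assumes each query and each document page has already been turned into a list of embedding vectors; the function names (`maxsim_score`, `rank_pages`) and the toy vectors are hypothetical and are not part of the colpali codebase.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs: List[Vector], page_vecs: List[Vector]) -> float:
    """Late-interaction score: for each query vector, keep its best match
    among the page's vectors, then sum those maxima."""
    return sum(max(dot(q, p) for p in page_vecs) for q in query_vecs)


def rank_pages(query_vecs: List[Vector], pages: Dict[str, List[Vector]]) -> List[str]:
    """Return page ids sorted from most to least relevant."""
    scores = {pid: maxsim_score(query_vecs, vecs) for pid, vecs in pages.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy, hand-made embeddings (hypothetical values, not real model output).
query = [[1.0, 0.0], [0.0, 1.0]]
pages = {
    "chart_page": [[0.9, 0.1], [0.1, 0.8]],
    "text_page": [[0.2, 0.1], [0.1, 0.3]],
}
ranking = rank_pages(query, pages)
```

In a real setup the vectors would come from a vision-language model applied to page images and query text; the sketch only shows how per-vector similarities might be combined into a single page score.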

2,555 stars. Actively maintained with 4 commits in the last 30 days.

Use this if you need to efficiently search and retrieve information from documents where visual layout, images, or charts are as important as the text itself.

Not ideal if your documents are purely text-based and you do not need to consider visual information for retrieval.

document-retrieval information-extraction legal-research academic-research knowledge-management

No Package No Dependents

Maintenance 13 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 20 / 25


Stars: 2,555

Forks: 236

Language: Python

License: MIT

Related tools

AnswerDotAI/byaldi

Use late-interaction multi-modal models such as ColPali in just a few lines of code.

jolibrain/colette

Multimodal RAG to search and interact locally with technical documents of any kind

nannib/nbmultirag

A framework in Italian and English that lets you chat with your own documents using RAG,...

OpenBMB/VisRAG

Parsing-free RAG supported by VLMs

chiang-yuan/llamp

[EMNLP '25] A web app and Python API for multi-modal RAG framework to ground LLMs on...
