aimagelab/ReT-2
Recurrence Meets Transformers for Universal Multimodal Retrieval
This project helps researchers and data scientists efficiently search large collections of mixed text-and-image data. You provide a query (text, an image, or both) and a dataset of multimodal documents; it returns the most relevant documents, enabling tasks like answering complex questions or finding related content across media types.
Use this if you need to find specific information or content by searching with both text and images against a vast, diverse dataset of multimodal documents.
Not ideal if your data is exclusively text-based or image-based, or if you need to perform quick, ad-hoc searches on a small, simple collection.
Stars: 15
Forks: —
Language: Python
License: Apache-2.0
Category: —
Last pushed: Dec 15, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/aimagelab/ReT-2"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
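The same request can be made programmatically. A minimal Python sketch is below; note that the `Authorization: Bearer` header scheme and the JSON response body are assumptions, not documented above — only the endpoint URL and rate limits come from this page.

```python
import json
import urllib.request

BASE = "https://pt-edge.onrender.com/api/v1/quality/rag"

def quality_url(owner: str, repo: str) -> str:
    """Build the quality-data endpoint URL for a given repository."""
    return f"{BASE}/{owner}/{repo}"

def fetch_quality(owner: str, repo: str, api_key=None) -> dict:
    """Fetch quality data for a repo.

    Without a key you get 100 requests/day; pass a free key for 1,000/day.
    The Bearer-token header used here is an assumed auth scheme -- check
    the API docs for the actual mechanism.
    """
    req = urllib.request.Request(quality_url(owner, repo))
    if api_key:
        req.add_header("Authorization", f"Bearer {api_key}")  # assumed scheme
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)  # assumes a JSON response body
```

For example, `fetch_quality("aimagelab", "ReT-2")` issues the same GET request as the curl command shown above.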
Higher-rated alternatives
illuin-tech/colpali
The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol.
AnswerDotAI/byaldi
Use late-interaction multi-modal models such as ColPali in just a few lines of code.
jolibrain/colette
Multimodal RAG to search and interact locally with technical documents of any kind
nannib/nbmultirag
A framework in Italian and English that lets you chat with your own documents via RAG, ...
OpenBMB/VisRAG
Parsing-free RAG supported by VLMs