kyopark2014/llm-multimodal-and-rag
It shows how to use multimodal models and RAG based on multi-region LLMs.
This project helps developers build intelligent applications that can understand both text and images. It takes raw data, including visual content, and uses large language models to provide enriched, context-aware responses. Developers can use this to create robust chatbots and AI assistants capable of handling diverse information.
No commits in the last 6 months.
Use this if you are a developer looking to build a generative AI application that needs to process both images and text, leveraging multiple LLMs for higher performance and reliability.
Not ideal if you are an end-user without programming experience, as this is a toolkit for developers to build applications, not a ready-to-use solution.
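The repository itself is the authoritative reference. Purely as a rough sketch of the multimodal, multi-region pattern described above, the following assumes Amazon Bedrock with the Claude 3 messages format; the model ID, region list, and simple failover loop are illustrative assumptions, not code taken from this project.

import json
import base64
import boto3

REGIONS = ["us-east-1", "us-west-2"]  # assumed regions; fail over across them
MODEL_ID = "anthropic.claude-3-sonnet-20240229-v1:0"  # illustrative model ID

def invoke_multimodal(prompt, image_bytes):
    # Claude 3 messages request with one image block and one text block.
    body = json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 1024,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "image", "source": {
                    "type": "base64", "media_type": "image/png",
                    "data": base64.b64encode(image_bytes).decode()}},
                {"type": "text", "text": prompt},
            ],
        }],
    })
    # Try each region in turn; naive failover, not this project's actual logic.
    for region in REGIONS:
        try:
            client = boto3.client("bedrock-runtime", region_name=region)
            resp = client.invoke_model(modelId=MODEL_ID, body=body)
            return json.loads(resp["body"].read())["content"][0]["text"]
        except Exception:
            continue
    raise RuntimeError("all regions failed")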
Stars: 27
Forks: 4
Language: Python
License: —
Category: —
Last pushed: Oct 18, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/kyopark2014/llm-multimodal-and-rag"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000 requests/day.
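The same data can be fetched from a script. A minimal Python sketch using requests follows; the response is assumed to be JSON, and its exact schema is not documented here, so inspect the payload before relying on specific keys.

import requests

url = ("https://pt-edge.onrender.com/api/v1/quality/rag/"
       "kyopark2014/llm-multimodal-and-rag")
resp = requests.get(url, timeout=10)
resp.raise_for_status()  # fail loudly on rate limiting or server errors
data = resp.json()       # assumed JSON body; schema not documented here
print(data)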
Higher-rated alternatives
illuin-tech/colpali
The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol.
AnswerDotAI/byaldi
Use late-interaction multi-modal models such as ColPali in just a few lines of code.
jolibrain/colette
Multimodal RAG to search and interact locally with technical documents of any kind
nannib/nbmultirag
A framework in Italian and English that lets you chat with your own documents using RAG, ...
OpenBMB/VisRAG
Parsing-free RAG supported by VLMs