ShafqaatMalik/multimodal_document_search
A multimodal document search engine that converts PDF documents into searchable vector embeddings using OpenAI's CLIP model. It supports cross-modal search across text and images with natural-language queries through a Streamlit interface.
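The description above implies a nearest-neighbor lookup over embeddings that share one vector space for text and images. The following is a minimal sketch of that ranking step only, not the repository's actual code. The names `cosine_similarity`, `search`, and `TOY_INDEX` are hypothetical, and the three-dimensional vectors are hand-written placeholders standing in for real CLIP embeddings.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def search(query_vec, index, top_k=3):
    """Rank index entries by similarity to the query, best first."""
    scored = [(cosine_similarity(query_vec, item["vector"]), item) for item in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Placeholder index: in a real system each vector would come from a CLIP
# text or image encoder. These values are invented for illustration.
TOY_INDEX = [
    {"id": "page1-text", "kind": "text", "vector": [0.9, 0.1, 0.0]},
    {"id": "page1-image", "kind": "image", "vector": [0.8, 0.2, 0.1]},
    {"id": "page2-text", "kind": "text", "vector": [0.0, 0.1, 0.9]},
]

if __name__ == "__main__":
    # Placeholder query vector; a real query would be an encoded text string.
    for score, item in search([1.0, 0.0, 0.0], TOY_INDEX, top_k=2):
        print(f"{item['id']} ({item['kind']}): {score:.3f}")
```

Because text and image entries are scored with the same function, a single query can return hits of either kind, which is the sense in which the search is cross-modal.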
Stars: 1
Forks: —
Language: Python
License: —
Category:
Last pushed: Oct 23, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/ShafqaatMalik/multimodal_document_search"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
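The same request can be made from Python. This is a sketch under stated assumptions: the listing does not document the response format, so JSON is assumed, and reading the URL path as category, owner, and repository is an inference from the single example URL. The helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path taken from the curl example in the listing.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL; the segment meanings are an assumption."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the record and parse it as JSON (assumed response format)."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("embeddings", "ShafqaatMalik", "multimodal_document_search")
# print(json.dumps(data, indent=2))
```

With the unauthenticated limit of 100 requests/day, caching the result locally is advisable if the call runs in a loop or a scheduled job.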
Higher-rated alternatives
unum-cloud/UForm
Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts,...
rom1504/clip-retrieval
Easily compute CLIP embeddings and build a CLIP retrieval system with them
mazzzystar/Queryable
Run OpenAI's CLIP and Apple's MobileCLIP models on iOS to search photos.
s-emanuilov/litepali
LitePali is a minimal, efficient implementation of ColPali for image retrieval and indexing,...
slavabarkov/tidy
Offline semantic Text-to-Image and Image-to-Image search on Android powered by quantized...