CLIP Multimodal Search NLP Tools

Tools for searching and retrieving images, videos, or multimodal content using CLIP-based vision-language models and text/image queries. Does NOT include general image captioning, visual question answering without search functionality, or non-CLIP multimodal architectures.

There are 20 CLIP multimodal search tools tracked. One scores above 50 (Established tier). The highest-rated is ClipsAI/clipsai at 59/100 with 455 stars.

Get all 20 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=nlp&subcategory=clip-multimodal-search&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
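The same request can be made from a script. The following is a minimal sketch using only the Python standard library; the helper names `build_url` and `fetch_projects` are hypothetical, and the structure of the returned JSON is not documented on this page, so inspect the response before relying on specific fields.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="nlp", subcategory="clip-multimodal-search", limit=20):
    """Build the dataset URL with the same query parameters as the curl example."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """Fetch the URL and decode the body as JSON.

    The response schema is an unknown here; this returns whatever
    the API sends back (assumed to be valid JSON).
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_projects(build_url())` issues one request, which counts toward the 100 requests/day keyless limit.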

| # | Tool | Description | Score | Tier |
|---|------|-------------|-------|------|
| 1 | ClipsAI/clipsai | Clips AI is an open-source Python library that automatically converts long... | 59 | Established |
| 2 | ai-forever/ru-clip | CLIP implementation for Russian language | 47 | Emerging |
| 3 | patrickjohncyh/fashion-clip | FashionCLIP is a CLIP-like model fine-tuned for the fashion domain. | 43 | Emerging |
| 4 | Lednik7/CLIP-ONNX | It is a simple library to speed up CLIP inference up to 3x (K80 GPU) | 42 | Emerging |
| 5 | suinleelab/CellCLIP | [NeurIPS 2025] CellCLIP – Learning Perturbation Effects in Cell Painting via... | 37 | Emerging |
| 6 | cene555/ruCLIP-SB | RuCLIP-SB (Russian Contrastive Language–Image Pretraining SWIN-BERT) is a... | 35 | Emerging |
| 7 | GuyARoss/CLIP-video-search | demo natural language video db using CLIP | 31 | Emerging |
| 8 | emerisly/EDIS | Entity-Driven Image Search over Multimodal Web Content (EMNLP 2023) | 27 | Experimental |
| 9 | DeliriumV01D/RuCLIP | Unofficial C++ LibTorch implementation of RuCLIP (Sber AI) | 27 | Experimental |
| 10 | DevMilk/ImageOps | Reverse Image Based Entity Search Tool | 25 | Experimental |
| 11 | DARK-art108/Image-Search-Using-CLIP-VIT | A powerful image search using CLIP (Contrastive Language-Image Pre-Training)... | 25 | Experimental |
| 12 | sugarandgugu/Text2Image-Retrieval | Computer vision course project: an image-text retrieval system based on Chinese-CLIP | 24 | Experimental |
| 13 | krishnaura45/ImageQuant | 🤖Combining 🔠NLP and 🧠 Deep Learning for 📦Image-Based Entity Extraction | 20 | Experimental |
| 14 | maliha-usui/cross-lingual-clip-memes | Cross-lingual evaluation of CLIP on Japanese vs English memes, revealing... | 19 | Experimental |
| 15 | gangula-karthik/AICU-BIKE-SEARCH | Find Your Stolen Bike Lah! With AICU, We Kena Spot Your Bicycle on Carousell... | 19 | Experimental |
| 16 | saadkh1/clip_dual_encoder | Visual and Vision-Language Representation Pre-Training with Contrastive Learning | 18 | Experimental |
| 17 | QQBrowserVideoSearch/CBVS-UniCLIP | A Large-Scale Chinese Image-Text Benchmark for Real-World Short Video Search... | 13 | Experimental |
| 18 | ubaidkhan08/CLIFS-Contrastive-Language-Image-Forensic-Search | CLIFS (CLIP-based Frame Selection) is a Python function that takes in a... | 12 | Experimental |
| 19 | fortierq/image-retrieval | Retrieve images from a text query. Based on deep learning (CNN) and word embedding. | 10 | Experimental |
| 20 | KernelA/clip-text-search | Search images by text input with CLIP | 10 | Experimental |