CLIP Image Embeddings Transformer Models

Tools for generating and working with CLIP image-text embeddings, including implementations, fine-tuning, and lightweight variants. Does NOT include general vision-language models, text-to-image generation, or multimodal fusion frameworks.

22 CLIP image embedding models are tracked. The highest-rated is OFA-Sys/Chinese-CLIP, which scores 48/100 and has 5,820 stars.

Get all 22 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=clip-image-embeddings&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
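The same request can be made from a script. The following is a minimal Python sketch using only the standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and the shape of the JSON response is not documented here, so the code returns the parsed payload as-is rather than assuming any field names. Note that the example URL above passes `limit=20` while 22 projects are tracked. Whether the endpoint accepts a larger `limit` is an assumption you would need to verify.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_quality_url(domain, subcategory, limit):
    """Build the request URL with the same query parameters as the curl example."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_quality(url, timeout=30):
    """Perform the GET request and return the parsed JSON payload unchanged.

    The response structure is not specified on this page, so no fields
    are accessed here (hypothetical helper, not part of any official client).
    """
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_quality(build_quality_url("transformers", "clip-image-embeddings", 20))` would issue the same request as the curl command. Keyless access is limited to 100 requests/day, so a script that polls this endpoint should cache the result.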

| # | Model | Description | Score | Tier |
|---|-------|-------------|-------|------|
| 1 | OFA-Sys/Chinese-CLIP | Chinese version of CLIP which achieves Chinese cross-modal retrieval and... | 48 | Emerging |
| 2 | Kaushalya/medclip | A multi-modal CLIP model trained on the medical dataset ROCO | 44 | Emerging |
| 3 | kastalimohammed1965/CLIP-fine-tune-registers-gated | Vision Transformers Needs Registers. And Gated MLPs. And +20M params. Tiny... | 43 | Emerging |
| 4 | BUAADreamer/SPN4CIR | [ACM MM 2024] Improving Composed Image Retrieval via Contrastive Learning... | 35 | Emerging |
| 5 | clip-italian/clip-italian | CLIP (Contrastive Language–Image Pre-training) for Italian | 32 | Emerging |
| 6 | zer0int/CLIP-fine-tune-registers-gated | Vision Transformers Needs Registers. And Gated MLPs. And +20M params. Tiny... | 29 | Experimental |
| 7 | Armaggheddon/ClipServe | 🚀 ClipServe: A fast API server for embedding text, images, and performing... | 28 | Experimental |
| 8 | kyegomez/MuonClip | This repository is an open source implementation of the MuonClip strategy... | 27 | Experimental |
| 9 | FuxiaoLiu/DocumentCLIP | [ICPRAI 2024] DocumentCLIP: Linking Figures and Main Body Text in Reflowed Documents | 22 | Experimental |
| 10 | taherfattahi/MetaWorld-VLA-openai-clip-vit | A lightweight Vision-Language-Action (VLA) baseline for MetaWorld robot-arm... | 22 | Experimental |
| 11 | YUSH19883/cog-jinaai-jina-clip-v2 | 🖼️ Generate high-quality multimodal embeddings for text and images with Jina... | 21 | Experimental |
| 12 | MuhammadAliS/CLIP | PyTorch implementation of OpenAI's CLIP model for image classification,... | 19 | Experimental |
| 13 | corentin-ryr/CLIP-mixer | Implementation of CLIP using a Mixer architecture | 19 | Experimental |
| 14 | VijayPrakashReddy-k/CLIP-PACL | Contrastive Language - Image Pre-training (CLIP) and Patch Aligned... | 19 | Experimental |
| 15 | zsxkib/cog-jinaai-jina-clip-v2 | Jina CLIP v2 - Multimodal embedding model for text and images with... | 18 | Experimental |
| 16 | ptmorris03/CLIPEmbedding | Easy text-image embedding and similarity with pretrained CLIP in PyTorch | 17 | Experimental |
| 17 | seanghay/clipsort | Group images by provided labels using OpenAI/CLIP | 17 | Experimental |
| 18 | SuryaAnything/V-DeClip | Masked Multi-Component Gated Decomposition Architecture | 17 | Experimental |
| 19 | theSohamTUmbare/CLIP-model | Reimplementation of the CLIP model | 15 | Experimental |
| 20 | ntat/Lightweight_CLIP_model | A lightweight Pytorch implementation of OpenAI's CLIP model. | 13 | Experimental |
| 21 | Rakshath66/ClipFindr | 🔍 A CLIP-powered image similarity finder built with Streamlit — upload a... | 13 | Experimental |
| 22 | monatis/turkish-clip | Embed texts in Turkish to be used with OpenAI's CLIP | 12 | Experimental |