CLIP Vision Language ML Frameworks

Implementations, adaptations, and applications of CLIP and similar vision-language models for zero-shot classification, image-text matching, and multimodal tasks. Does NOT include other vision-language models (like BLIP or LLaVA), general multimodal frameworks, or unrelated CLIPS language systems.
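At inference time, the zero-shot classification these frameworks implement reduces to one operation: embed the image and a set of text prompts into a shared space, then pick the prompt with the highest cosine similarity. A toy sketch of that matching step, using hand-made vectors (the embedding values are invented for illustration; a real pipeline would get them from a trained image/text encoder pair such as open_clip):

```python
import math

# Toy sketch of CLIP-style zero-shot classification. NOTE: these
# "embeddings" are hand-made illustrative vectors, not outputs of a
# real model; a real pipeline would obtain them from a trained
# image/text encoder pair.

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def zero_shot_classify(image_emb, prompts):
    """Return the prompt whose embedding is most similar to the image embedding."""
    scored = [(cosine(image_emb, emb), label) for label, emb in prompts.items()]
    return max(scored)[1]

prompts = {
    "a photo of a dog": [1.0, 0.0, 0.2],
    "a photo of a cat": [0.0, 1.0, 0.2],
}
image_emb = [0.9, 0.1, 0.3]  # constructed to sit closer to the "dog" prompt

print(zero_shot_classify(image_emb, prompts))  # a photo of a dog
```

Everything else in these repos (prompt templates, adapters, fine-tuning, ONNX export) is machinery around producing better embeddings for this comparison.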

There are 46 CLIP vision-language frameworks tracked. One scores above 70 (Verified tier). The highest-rated is mlfoundations/open_clip at 73/100 with 13,496 stars. Two of the top 10 are actively maintained.

Get all 46 projects as JSON (raise the `limit` parameter to cover every entry; the example below returns the first 20):

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=ml-frameworks&subcategory=clip-vision-language&limit=20"

Open to everyone: 100 requests/day with no key needed. A free key raises this to 1,000 requests/day.
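For scripted access, the same query can be assembled with the Python standard library. The endpoint and parameter names below are taken directly from the curl example above; the response schema is not documented on this page, so this sketch stops at building and (optionally) issuing the request:

```python
from urllib.parse import urlencode

# Endpoint and parameter names copied from the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/datasets/quality"

params = {
    "domain": "ml-frameworks",
    "subcategory": "clip-vision-language",
    "limit": 46,  # raised from the example's 20 to cover all 46 tracked projects
}
url = f"{BASE}?{urlencode(params)}"
print(url)

# To actually fetch (requires network access; response schema undocumented here):
#   import json, urllib.request
#   data = json.load(urllib.request.urlopen(url))
```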

| # | Framework | Description | Score | Tier |
|---|-----------|-------------|-------|------|
| 1 | mlfoundations/open_clip | An open source implementation of CLIP. | 73 | Verified |
| 2 | noxdafox/clipspy | Python CFFI bindings for the 'C' Language Integrated Production System CLIPS | 62 | Established |
| 3 | openai/CLIP | CLIP (Contrastive Language-Image Pretraining), Predict the most relevant... | 60 | Established |
| 4 | moein-shariatnia/OpenAI-CLIP | Simple implementation of OpenAI CLIP model in PyTorch. | 53 | Established |
| 5 | BioMedIA-MBZUAI/FetalCLIP | Official repository of FetalCLIP: A Visual-Language Foundation Model for... | 52 | Established |
| 6 | filipbasara0/simple-clip | A minimal, but effective implementation of CLIP (Contrastive Language-Image... | 50 | Established |
| 7 | cliport/cliport | CLIPort: What and Where Pathways for Robotic Manipulation | 49 | Emerging |
| 8 | WolodjaZ/MSAE | Interpreting CLIP with Hierarchical Sparse Autoencoders (ICML 2025) | 47 | Emerging |
| 9 | Dalageo/paperclip-inspection | Analyzing Paper Clips Using Deep Learning and Computer Vision Techniques 📎 | 44 | Emerging |
| 10 | SunzeY/AlphaCLIP | [CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want | 43 | Emerging |
| 11 | LeapLabTHU/Cross-Modal-Adapter | [Pattern Recognition 2025] Cross-Modal Adapter for Vision-Language Retrieval | 40 | Emerging |
| 12 | SiddhantBikram/MemeCLIP | Official Repository for the paper 'MemeCLIP: Leveraging CLIP Representations... | 39 | Emerging |
| 13 | jaisidhsingh/CoN-CLIP | Implementation of the "Learn No to Say Yes Better" paper. | 39 | Emerging |
| 14 | noxdafox/iclips | CLIPS Jupyter console | 39 | Emerging |
| 15 | lakeraai/onnx_clip | An ONNX-based implementation of the CLIP model that doesn't depend on torch... | 39 | Emerging |
| 16 | kevinzakka/clip_playground | An ever-growing playground of notebooks showcasing CLIP's impressive... | 38 | Emerging |
| 17 | merveenoyan/siglip | Projects based on SigLIP (Zhai et al., 2023) and Hugging Face transformers... | 38 | Emerging |
| 18 | svpino/clip-container | A containerized REST API around OpenAI's CLIP model. | 36 | Emerging |
| 19 | UCSC-VLAA/CLIPA | [NeurIPS 2023] This repository includes the official implementation of our... | 36 | Emerging |
| 20 | sixu0/SeisCLIP | The code of Paper 'SeisCLIP: A seismology foundation model pre-trained by... | 35 | Emerging |
| 21 | Mauville/MedCLIP | Medical image captioning using OpenAI's CLIP | 35 | Emerging |
| 22 | sarthaxxxxx/BATCLIP | [ICCV '25] BATCLIP: Bimodal Online Test-Time Adaptation for CLIP | 35 | Emerging |
| 23 | aygong/ClipMind | Code for the paper "ClipMind: A Framework for Auditing Short-Format Video... | 34 | Emerging |
| 24 | RobertBiehl/CLIP-tf2 | OpenAI CLIP converted to TensorFlow 2/Keras | 33 | Emerging |
| 25 | bes-dev/pytorch_clip_bbox | PyTorch-based library to rank predicted bounding boxes using text/image... | 33 | Emerging |
| 26 | bes-dev/pytorch_clip_guided_loss | A simple library that implements CLIP guided loss in PyTorch. | 31 | Emerging |
| 27 | KeremTurgutlu/clip_art | CLIP-Art: Contrastive Pre-training for Fine-Grained Art Classification - 4th... | 30 | Emerging |
| 28 | LAION-AI/scaling-laws-openclip | Reproducible scaling laws for contrastive language-image learning... | 30 | Emerging |
| 29 | halixness/understanding-CLIP | Repo from the "Learning with limited labeled data" seminar @ Uni of... | 29 | Experimental |
| 30 | CoderChen01/InterCLIP-MEP | Official repository of the paper "InterCLIP-MEP: Interactive CLIP and... | 28 | Experimental |
| 31 | zjunlp/SPEECH | [ACL 2023] SPEECH: Structured Prediction with Energy-Based Event-Centric Hyperspheres | 27 | Experimental |
| 32 | ExcelsiorCJH/CLIP | CLIP: Learning Transferable Visual Models From Natural Language Supervision | 27 | Experimental |
| 33 | your-ai-solution/generation-image-caption | This application fine-tunes the CLIP model on the Flickr8k dataset to align... | 25 | Experimental |
| 34 | Evfidiw/MoBA | [ACMMM'24] MoBA: Mixture of Bi-directional Adapter for Multi-modal Sarcasm Detection | 24 | Experimental |
| 35 | D0miH/does-clip-know-my-face | Source Code for the JAIR Paper "Does CLIP Know my Face?" (Demo:... | 22 | Experimental |
| 36 | Fr0zenCrane/Cockatiel | The official implementation of our paper "Cockatiel: Ensembling Synthetic... | 20 | Experimental |
| 37 | MingliangLiang3/GLIP | Centered Masking for Language-Image Pre-training | 20 | Experimental |
| 38 | rhysdg/vision-at-a-clip | Low-latency ONNX and TensorRT based zero-shot classification and detection... | 19 | Experimental |
| 39 | A-SHOJAEI/multimodal-contrastive-captioning-with-preference-aligned-generation | Vision-language model combining CLIP-style contrastive learning with... | 19 | Experimental |
| 40 | jonkahana/CLIPPR | An official PyTorch implementation for CLIPPR | 19 | Experimental |
| 41 | nicolafan/clipper | Explore your CLIP embeddings in a bidimensional space | 18 | Experimental |
| 42 | smb-h/mqirtn | Multimodal Query Enhancement for Image Retrieval using Transformer Networks (MQIRTN) | 17 | Experimental |
| 43 | ImtiazShuvo/clip-lora-food101-classification | Transfer learning and parameter-efficient fine-tuning of CLIP on the... | 14 | Experimental |
| 44 | Bijay-kumar-sethy/clip | 🔍 Solve linear programming problems efficiently with Clp, an open-source... | 13 | Experimental |
| 45 | KeithLin724/HAR_Clip | Human Action Recognition using Clip | 13 | Experimental |
| 46 | MaharshPatelX/qwen-clip-multimodal | Multimodal Vision-AI: CLIP eyes + Qwen2.5 brain, 155 K-step pipeline & demo. | 13 | Experimental |