mlfoundations/open_clip

An open source implementation of CLIP.

Score: 73 / 100 (Verified)

This project provides pre-trained models that jointly understand images and text. Given an image and a list of candidate text descriptions, the models return a probability for each description of being the best match. This is useful for researchers or developers building applications that categorize images using natural language or search for images with text queries.
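
In practice, zero-shot classification takes only a few lines. The following is a minimal sketch based on the library's documented usage pattern; the model name, pretrained tag, image path, and captions are placeholders to swap for your own:

import torch
import open_clip
from PIL import Image

# Load a pre-trained model plus its matching preprocessing transforms.
# 'ViT-B-32' / 'laion2b_s34b_b79k' is one of the published checkpoints.
model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"
)
model.eval()
tokenizer = open_clip.get_tokenizer("ViT-B-32")

image = preprocess(Image.open("example.jpg")).unsqueeze(0)  # placeholder path
text = tokenizer(["a diagram", "a dog", "a cat"])           # candidate captions

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    # Normalize so the dot product is cosine similarity.
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    # One probability per caption for matching the image.
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print(probs)

The softmax over scaled cosine similarities is what turns raw image-text affinities into per-caption probabilities.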

13,496 stars. Used by 18 other packages. Actively maintained with 1 commit in the last 30 days. Available on PyPI.

Use this if you need to build applications that classify images using text descriptions or retrieve images based on text queries, without requiring extensive, custom image labeling.
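
Retrieval follows the same pattern: embed the image collection once, then rank by cosine similarity against an embedded text query. A sketch under the same assumptions as above (file paths and the query string are placeholders):

import torch
import open_clip
from PIL import Image

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"
)
model.eval()
tokenizer = open_clip.get_tokenizer("ViT-B-32")

paths = ["img_0.jpg", "img_1.jpg", "img_2.jpg"]  # placeholder image corpus
batch = torch.stack([preprocess(Image.open(p)) for p in paths])

with torch.no_grad():
    corpus = model.encode_image(batch)
    corpus /= corpus.norm(dim=-1, keepdim=True)
    query = model.encode_text(tokenizer(["a red bicycle"]))  # placeholder query
    query /= query.norm(dim=-1, keepdim=True)

# Cosine similarity of the query against every image, highest first.
scores = (query @ corpus.T).squeeze(0)
for idx in scores.argsort(descending=True):
    print(paths[idx], round(float(scores[idx]), 3))

For a real corpus you would compute and cache the image embeddings ahead of time, since only the query needs to be encoded at search time.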

Not ideal if your primary goal is fine-tuning a pre-trained model for traditional image classification on a specific dataset; the project recommends a separate repository for that workflow.

Tags: image-text-matching, zero-shot-classification, multimodal-search, computer-vision, natural-language-processing
Maintenance: 13 / 25
Adoption: 15 / 25
Maturity: 25 / 25
Community: 20 / 25

Each dimension is scored out of 25, and the four sum to the overall score: 13 + 15 + 25 + 20 = 73.

Stars: 13,496
Forks: 1,253
Language: Python
License:
Last pushed: Mar 12, 2026
Commits (30d): 1
Dependencies: 8
Reverse dependents: 18

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/mlfoundations/open_clip"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
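
The same endpoint can also be queried from Python, for instance with the requests library. A sketch: the response schema is not documented on this card, and the header used to pass an API key is a hypothetical name, so check the service's documentation before relying on either:

import requests

URL = "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/mlfoundations/open_clip"

# No key needed for up to 100 requests/day; a free key raises that to 1,000.
headers = {}  # e.g. {"X-API-Key": "YOUR_KEY"} -- hypothetical header name
resp = requests.get(URL, headers=headers, timeout=10)
resp.raise_for_status()
print(resp.json())  # response schema not shown above; inspect it here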