astra-vision/LatteCLIP

[WACV 2025] LatteCLIP: Unsupervised CLIP Fine-Tuning via LMM-Synthetic Texts

Quality score: 28 / 100 (Experimental)

This project helps machine learning engineers and researchers improve the performance of vision-language models, specifically CLIP, on custom image datasets without manual data labeling. It takes your unlabeled image datasets and automatically generates descriptive texts using large multimodal models (LMMs). The output is a fine-tuned CLIP model that understands your specific images better, ready for tasks like image classification or search.
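To make the classification use case concrete, here is a minimal sketch of how CLIP-style zero-shot classification works: embed the image and each candidate label prompt, then pick the label whose embedding is most similar to the image embedding. The dummy vectors below stand in for real CLIP encoders, which this sketch does not load.

```python
# CLIP-style zero-shot classification, sketched with dummy embeddings.
# A real pipeline would replace the fixed vectors with the outputs of
# CLIP's image and text encoders (e.g. a LatteCLIP fine-tuned checkpoint).
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def classify(image_emb, label_embs):
    """Return the label whose embedding best matches the image embedding."""
    return max(label_embs, key=lambda lbl: cosine(image_emb, label_embs[lbl]))

# Dummy embeddings standing in for the encoders' outputs.
image_emb = [0.9, 0.1, 0.0]
label_embs = {
    "a photo of a cat": [0.8, 0.2, 0.1],
    "a photo of a dog": [0.1, 0.9, 0.2],
}
print(classify(image_emb, label_embs))  # -> a photo of a cat
```

Fine-tuning on your own dataset shifts these embeddings so that your domain's images land closer to the correct label prompts.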

No commits in the last 6 months.

Use this if you have a unique image dataset and want to boost a CLIP model's accuracy on it, but lack the resources for extensive manual text labeling.

Not ideal if you don't work with deep learning models or don't have access to substantial GPU compute resources.

computer-vision machine-learning-engineering image-recognition model-fine-tuning unsupervised-learning
Stale (6m) · No Package · No Dependents
Maintenance 0 / 25
Adoption 5 / 25
Maturity 16 / 25
Community 7 / 25


Stars: 10
Forks: 1
Language: Jupyter Notebook
License: (none listed)
Last pushed: Jan 27, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/astra-vision/LatteCLIP"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
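For scripted access, the same endpoint can be called from Python. The URL is the one shown above; the JSON field names in the response are not documented on this page, so the sketch just fetches and pretty-prints whatever comes back rather than assuming a schema.

```python
# Sketch: fetching the quality report from the public API.
# The endpoint path is taken from the curl example on this page;
# the response schema is an assumption -- inspect it before relying on it.
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"

def quality_api_url(repo: str) -> str:
    """Build the API URL for an owner/name repo slug."""
    return f"{API_BASE}/{repo}"

def fetch_quality(repo: str) -> dict:
    """Fetch and decode the JSON quality report (network required)."""
    with urllib.request.urlopen(quality_api_url(repo)) as resp:
        return json.load(resp)

if __name__ == "__main__":
    print(json.dumps(fetch_quality("astra-vision/LatteCLIP"), indent=2))
```

With a free API key the daily limit rises from 100 to 1,000 requests; how the key is passed (header or query parameter) is not stated here, so check the API docs before adding it.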