ivonajdenkoska/tulip

[ICLR 2025] Official code repository for "TULIP: Token-length Upgraded CLIP"

Quality score: 37 / 100 (Emerging)

TULIP upgrades existing CLIP-like models so they can process longer, more descriptive captions, improving performance on tasks such as matching images to long descriptions or generating images from detailed text prompts. It is aimed at researchers and developers building or training multimodal vision-language models, especially those working with complex visual scenes and rich textual narratives.

Use this if you need to improve the performance of your existing CLIP-like models on tasks involving long, descriptive image captions or detailed text prompts for image generation.

Not ideal if you only need short, simple image-text matching, or if you are not working with advanced deep learning models.

Tags: AI model training, multimodal AI, image understanding, natural language processing, computer vision
No package · No dependents
Maintenance 10 / 25
Adoption 7 / 25
Maturity 16 / 25
Community 4 / 25


Stars: 33
Forks: 1
Language: Python
License: Apache-2.0
Last pushed: Jan 26, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/ivonajdenkoska/tulip"

Open to everyone: 100 requests/day with no key needed; a free key raises the limit to 1,000/day.
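The curl command above can be wrapped in a few lines of Python. A minimal sketch follows: the endpoint path is taken verbatim from the curl example, but the meaning of the `transformers` path segment (assumed here to name a registry) and the JSON response schema are assumptions, so inspect the raw payload before relying on any field names.

```python
import json
import urllib.request

# Base path taken from the curl example on this page.
BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(repo: str, registry: str = "transformers") -> str:
    """Build the quality-endpoint URL for a repo.

    The 'transformers' segment is assumed to identify the registry;
    only the full path from the curl example is known to be valid.
    """
    return f"{BASE}/{registry}/{repo}"


def fetch_quality(repo: str) -> dict:
    """Fetch and decode the JSON response for a repo.

    The response schema is undocumented here, so callers should
    inspect the returned dict rather than assume specific keys.
    """
    with urllib.request.urlopen(quality_url(repo), timeout=10) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Reproduces the URL from the curl example above.
    print(quality_url("ivonajdenkoska/tulip"))
```

Note the free tier's 100 requests/day limit: cache responses locally rather than re-fetching on every run.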