BLIP Image Captioning Transformer Models

End-to-end image captioning systems using BLIP models, including web interfaces, fine-tuning, batch processing, and caption generation. Does NOT include general vision-language models, CLIP embeddings, or non-captioning vision tasks like classification or object detection.

There are 23 BLIP image captioning models tracked. Two score above 50 (Established tier). The highest-rated is label-sleuth/label-sleuth at 56/100 with 271 stars.

Get all 23 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=blip-image-captioning&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
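The same request can be made from a script. The following is a minimal sketch using only the Python standard library; the endpoint and query parameters are copied from the curl command above, while the helper names `build_url` and `fetch_projects` are hypothetical. The shape of the JSON response is not documented here, so the sketch returns the parsed payload without assuming any field names. The `limit=20` value is also taken as-is from the command above; whether a larger value is needed to return all 23 projects is not stated on this page.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain, subcategory, limit):
    """Build the request URL with the same query parameters as the curl example."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """Send a GET request and return the parsed JSON payload.

    The response structure is an unknown here, so no field names are assumed.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Build (but do not send) the request used in the curl example.
url = build_url("transformers", "blip-image-captioning", 20)
print(url)
```

Calling `fetch_projects(url)` performs the actual request and counts against the daily quota described above.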

# | Model | Description | Score | Tier
1 | label-sleuth/label-sleuth | Open source no-code system for text annotation and building of text classifiers | 56 | Established
2 | CVHub520/X-AnyLabeling-Server | A Simple, Lightweight, and Extensible Serving Framework for X-AnyLabeling | 51 | Established
3 | antoninodimaggio/Hugging-Captions | Generate realistic Instagram captions using transformers 🤗 | 34 | Emerging
4 | VisioSphereAI/labelvim | This is a python based standalone image annotation tool designed for tasks... | 32 | Emerging
5 | hem9984/Dataset-label | This will allow you to choose your labels, and then label every image in a... | 31 | Emerging
6 | Merserk/Caption-Creator | Caption Creator is a fast and portable tool for generating high-quality... | 29 | Experimental
7 | FuxiaoLiu/VisualNews-Repository | [EMNLP'21] Visual News: Benchmark and Challenges in News Image Captioning | 28 | Experimental
8 | dmdin/SceneDescriptor | 🎞 Video editor with description generation for MTS TrueTech Hack | 26 | Experimental
9 | eray-yuztyurk/python-ai-image-captioning | AI-powered image captioning app using the BLIP model. Instantly generate... | 21 | Experimental
10 | ash-01xor/Imgcap | A CLI to generate captions for images | 19 | Experimental
11 | spongedsc/SpongeML | SocketIO server providing a CharacterAI proxy and Image Captioning | 18 | Experimental
12 | mozartsempiano/psykos | Bot that fetches random images from Tumblr, analyzes their aesthetics, and... | 17 | Experimental
13 | mahalrs/newsgen | Multi-Modal Image Generation for News Stories | 17 | Experimental
14 | tharun-ship-it/image-to-text-generator | 🖼️BLIP-powered Image-to-Text Generator achieving 136.7 CIDEr score on... | 17 | Experimental
15 | Asimo-o/blipren_release | 🚀 Train any LLM with BLIPren, a flexible architecture that adapts to your... | 15 | Experimental
16 | eren23/blipren_release | BLIP-2 implementation for training vision-language models. Q-Former + frozen... | 14 | Experimental
17 | coffeedrunkpanda/multimodal-api | A FastAPI service that leverages BLIP-2 transformer models for image... | 13 | Experimental
18 | enigmatronix13/Neural-Style-Transfer | Flask-based web app that performs Neural Style Transfer (NST) using... | 13 | Experimental
19 | ai-art-dev99/vision-language-caption-vqa | End-to-end BLIP + LLaVA project for image captioning and VQA with... | 13 | Experimental
20 | devtitus/Image-Caption-with-Pretrained-model | A simple yet powerful image captioning application that uses Salesforce's... | 13 | Experimental
21 | Laasya-online/Multimodal-Captioner | Image captioning lab with BLIP. | 12 | Experimental
22 | AbdullahAlokayl/image-to-text-inspirations | Fine-Tuning BLIP for Image Captioning A project to fine-tune the BLIP model... | 11 | Experimental
23 | Nauman123-coder/Automated-Image-Captioner | An end-to-end image captioning system using the BLIP (Bootstrapping... | 10 | Experimental