Fr0zenCrane/Cockatiel
The official implementation of our paper "Cockatiel: Ensembling Synthetic and Human Preferenced Training for Detailed Video Caption"
Cockatiel generates detailed video captions aligned with human preferences, going beyond simple descriptions to capture fine-grained aspects such as objects, camera movements, and background details. You provide a video file, and it outputs a rich, descriptive caption. It suits researchers, content creators, and analysts who need precise, nuanced textual descriptions of video content.
No commits in the last 6 months.
Use this if you need to automatically generate detailed, human-aligned descriptions for a large collection of videos, capturing specific aspects of the video content beyond basic summaries.
Not ideal if you only need very short, generic captions or if your primary need is real-time video transcription without detailed scene analysis.
Stars: 38
Forks: 1
Language: Python
License: —
Category:
Last pushed: May 21, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/Fr0zenCrane/Cockatiel"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
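The same request can be made from Python. The following is a minimal sketch using only the standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical, the category/owner/repo path layout is inferred from the single URL above, and the assumption that the endpoint returns JSON is not confirmed by this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the quality URL for one repository.

    Assumption: the path is <category>/<owner>/<repo>, as inferred
    from the single example URL on this page.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record without an API key.

    Assumption: the response body is JSON. The keyless tier is limited
    to 100 requests/day, so cache results rather than polling.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to hit the network.
    print(build_quality_url("ml-frameworks", "Fr0zenCrane", "Cockatiel"))
```

Calling `fetch_quality("ml-frameworks", "Fr0zenCrane", "Cockatiel")` performs the actual request. The shape of the returned data is not documented here, so inspect it before relying on specific fields.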
Higher-rated alternatives
mlfoundations/open_clip
An open source implementation of CLIP.
noxdafox/clipspy
Python CFFI bindings for the 'C' Language Integrated Production System CLIPS
openai/CLIP
CLIP (Contrastive Language-Image Pretraining): predicts the most relevant text snippet for a given image.
moein-shariatnia/OpenAI-CLIP
Simple implementation of OpenAI CLIP model in PyTorch.
BioMedIA-MBZUAI/FetalCLIP
Official repository of FetalCLIP: A Visual-Language Foundation Model for Fetal Ultrasound Image Analysis