bytedance/Shot2Story
Shot2Story is a new multi-shot video understanding benchmark with comprehensive video summaries and detailed shot-level captions.
This project helps video content creators, marketers, or researchers quickly understand the narrative of multi-shot videos. You provide a video, and it generates a detailed textual summary of the entire video and captions for individual shots, including both visual and spoken elements. It is designed for anyone who needs to extract comprehensive textual descriptions from video content.
168 stars. No commits in the last 6 months.
Use this if you need to generate detailed textual summaries and individual shot captions for multi-shot videos to quickly grasp their content.
Not ideal if you are looking for a tool to edit videos or perform complex video analytics beyond text generation.
Stars: 168
Forks: 9
Language: Python
License: —
Category:
Last pushed: Jan 30, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/computer-vision/bytedance/Shot2Story"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
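The same request can be made from Python. The sketch below uses only the standard library and rests on two assumptions that this page does not confirm: that the endpoint returns JSON, and that the path follows a category/owner/repo pattern (inferred from the single curl URL above). The helper names build_quality_url and fetch_quality are hypothetical, chosen for illustration.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the request URL.

    Assumption: the path is <category>/<owner>/<repo>, inferred from
    the one example URL shown on this page.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the data for one repository.

    Assumption: the response body is JSON. If it is not, json.loads
    raises a ValueError and the raw body should be inspected instead.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the URL; call fetch_quality(...) to hit the network.
    print(build_quality_url("computer-vision", "bytedance", "Shot2Story"))
```

Calling fetch_quality("computer-vision", "bytedance", "Shot2Story") issues the same GET request as the curl command. Without a key, requests count against the 100/day limit stated above.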