Skyline-9/Visionary-Vids

Multi-modal transformer approach for natural-language-query-based joint video summarization and highlight detection

Quality score: 31 / 100 (Emerging)

This project helps video content creators and editors quickly identify and extract the most important segments from long video footage. You provide a natural language query describing what you're looking for, and it outputs precise video clips that match your description and highlights the key moments within them. It is ideal for anyone who needs to efficiently create shorter versions of videos or find specific events inside them.

No commits in the last 6 months.

Use this if you need to rapidly summarize long videos or pinpoint specific highlights using simple text descriptions, without manually scrubbing through footage.

Not ideal if you primarily work with image data, require extremely granular frame-by-frame editing, or are not comfortable with command-line tools for setup and operation.

Tags: video-editing, content-creation, media-management, video-summarization, event-detection
Flags: Stale (6 months), No Package, No Dependents

Score breakdown:
Maintenance: 0 / 25
Adoption: 6 / 25
Maturity: 16 / 25
Community: 9 / 25


Stars: 17
Forks: 2
Language: Jupyter Notebook
License: (not listed)
Last pushed: May 23, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/Skyline-9/Visionary-Vids"

Open to everyone: 100 requests/day with no key required. A free key raises the limit to 1,000/day.
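If you'd rather call the endpoint from code than from curl, a minimal Python sketch follows. Only the URL pattern is taken from the example above; the `quality_url` and `fetch_quality` helper names are our own, the "transformers" topic segment is copied from the sample URL and may differ for other projects, and the assumption that the API returns JSON is unverified.

```python
import json
import urllib.request

# Base path taken from the curl example on this page.
BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(topic: str, owner: str, repo: str) -> str:
    """Build the quality-score URL for a repository.

    The topic/owner/repo path segments mirror the sample curl URL;
    other topics or repos are an untested generalization.
    """
    return f"{BASE}/{topic}/{owner}/{repo}"

def fetch_quality(topic: str, owner: str, repo: str) -> dict:
    """Fetch the quality report. Assumes (unverified) a JSON response body."""
    with urllib.request.urlopen(quality_url(topic, owner, repo)) as resp:
        return json.load(resp)

# Example (performs a live network request, subject to the daily limit):
# report = fetch_quality("transformers", "Skyline-9", "Visionary-Vids")
# print(report)
```

Keeping URL construction separate from the request makes the helper easy to test offline and to point at a keyed endpoint later if you upgrade to the 1,000/day tier.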