Tanveer81/ReVisionLLM

Official implementation of ReVisionLLM: Recursive Vision-Language Model for Temporal Grounding in Hour-Long Videos.

Score: 35 / 100 (Emerging)

This project helps video analysts, content creators, and researchers find specific events within very long videos, including videos that run for hours. You provide a long video and a text description of what you're looking for, and it returns the start and end times of that event. It's designed for anyone who needs to pinpoint moments in extensive footage without manually scrubbing through it.

Use this if you need to precisely locate specific events or actions described by text within videos that can be several minutes to many hours long.

Not ideal if your videos are very short (a few seconds) or if you only need to identify broad categories of content rather than specific temporal boundaries.

video-analysis content-moderation footage-review multimedia-search event-detection
No Package · No Dependents
Maintenance 6 / 25
Adoption 8 / 25
Maturity 16 / 25
Community 5 / 25

Stars: 43
Forks: 2
Language: Python
License:
Last pushed: Nov 05, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/Tanveer81/ReVisionLLM"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
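
The same request can be made from Python. This is a minimal sketch using only the standard library, not an official client. The helper names `build_quality_url` and `fetch_quality` are hypothetical, the `transformers` path segment is treated as a category parameter purely by reading the URL above, and the response is assumed to be JSON (the page does not document the response format).

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl command on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(owner: str, repo: str, category: str = "transformers") -> str:
    """Build the endpoint URL for one repository.

    Assumption: the URL layout is <base>/<category>/<owner>/<repo>,
    inferred from the single example URL shown above.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(owner: str, repo: str, category: str = "transformers",
                  timeout: float = 10.0):
    """Send a GET request and decode the body.

    Assumption: the body is JSON; the field names are not documented here,
    so the decoded object is returned as-is.
    """
    url = build_quality_url(owner, repo, category)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("Tanveer81", "ReVisionLLM")
# print(data)
```

With the keyless limit of 100 requests/day, it may be worth caching the decoded result locally rather than refetching it on every run.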