wjun0830/QD-DETR
Official PyTorch repository for "QD-DETR: Query-Dependent Video Representation for Moment Retrieval and Highlight Detection" (CVPR 2023 paper)
This project helps video content creators, marketers, or media analysts pinpoint specific moments within long videos and automatically detect highlights. You provide a video and a text query (like "goal shot" or "product launch"), and it outputs the exact time segments in the video that match your query or stand out as highlights. This tool is for anyone needing to efficiently extract key information or create summaries from video footage.
246 stars. No commits in the last 6 months.
Use this if you need to quickly find specific events or automatically identify important segments within a video based on a textual description.
Not ideal if you're looking to analyze static images, process only audio, or if your primary goal is general video editing rather than content retrieval or highlight detection.
Stars: 246
Forks: 20
Language: Python
License: not listed
Category: (not shown)
Last pushed: Aug 12, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/computer-vision/wjun0830/QD-DETR"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
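The same request can be made from Python. The following is a minimal sketch using only the standard library. It assumes the endpoint returns a JSON body and that the path follows a category/owner/repo pattern, which is inferred from the single URL above rather than from any documented schema. The names `build_quality_url` and `fetch_quality` are hypothetical helpers introduced for illustration.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (hypothetical helper).

    Assumption: the path is <category>/<owner>/<repo>, as inferred
    from the one example URL on this page.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the listing data for one repository (hypothetical helper).

    Assumption: the response body is JSON. Its fields are not
    documented here, so the parsed object is returned unchanged.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Performs a real network request; keyless access is rate limited.
    data = fetch_quality("computer-vision", "wjun0830", "QD-DETR")
    print(json.dumps(data, indent=2))
```

Since keyless access is limited to 100 requests per day, it may be worth caching the result locally rather than calling the endpoint on every run.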
Higher-rated alternatives
BR-IDL/PaddleViT
PaddleViT: State-of-the-art Visual Transformer and MLP Models for PaddlePaddle 2.0+
pathak22/unsupervised-video
[CVPR 2017] Unsupervised deep learning using unlabelled videos on the web
IBM/CrossViT
Official implementation of CrossViT. https://arxiv.org/abs/2103.14899
NVlabs/GCVit
[ICML 2023] Official PyTorch implementation of Global Context Vision Transformers
ViTAE-Transformer/ViTDet
Unofficial implementation for [ECCV'22] "Exploring Plain Vision Transformer Backbones for Object...