sauradip/fewshotQAT
[BMVC 2021] Official PyTorch implementation of "Few Shot Temporal Action Localization using Query Adaptive Transformers"
This tool helps video analysts and researchers quickly identify specific actions within long, untrimmed videos, even with very few example videos to learn from. You provide a small set of videos demonstrating the action you're looking for, and it pinpoints where and when that action occurs in much larger video collections. This is ideal for those studying human behavior, sports, or surveillance footage with limited labeled data.
No commits in the last 6 months.
Use this if you need to find specific actions in video footage but have very few labeled examples, especially when dealing with long, unedited videos.
Not ideal if you have extensive, precisely annotated datasets for your actions, as this tool is designed for 'few-shot' learning scenarios.
Stars: 20
Forks: 3
Language: Python
License: —
Category:
Last pushed: Jul 12, 2022
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/sauradip/fewshotQAT"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
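The same request can be made from Python. The following is a minimal sketch using only the standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and the `category/owner/repo` path layout is an assumption inferred from the single URL in the curl command above. The response format is not described on this page, so the sketch returns the raw body without parsing it.

```python
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; everything after it is assumed
# to follow the pattern <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build an endpoint URL shaped like the one in the curl command."""
    # Percent-encode each segment so unusual characters cannot alter the path.
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0) -> str:
    """Send a GET request and return the raw response body as text.

    The body is returned unparsed because its format is not documented here.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling `fetch_quality("transformers", "sauradip", "fewshotQAT")` would request the URL shown in the curl command. Unauthenticated calls count against the 100 requests/day limit, so cache results when querying repeatedly.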
Higher-rated alternatives
NVlabs/MambaVision
[CVPR 2025] Official PyTorch Implementation of MambaVision: A Hybrid Mamba-Transformer Vision Backbone
sign-language-translator/sign-language-translator
Python library & framework to build custom translators for the hearing-impaired and translate...
kyegomez/Jamba
PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model"
autonomousvision/transfuser
[PAMI'23] TransFuser: Imitation with Transformer-Based Sensor Fusion for Autonomous Driving;...
kyegomez/MultiModalMamba
A novel implementation of fusing ViT with Mamba into a fast, agile, and high performance...