liveseongho/Awesome-Video-Language-Understanding
A Survey on video and language understanding.
This is a curated list of recent advances in video and language understanding. It compiles research papers and code repositories for tasks such as video retrieval, question answering, and captioning, along with relevant datasets. Researchers and practitioners in AI and machine learning will find it useful for exploring the state of the art in multimodal video analysis.
No commits in the last 6 months.
Use this if you are a researcher or AI developer looking for the latest techniques and resources to build systems that understand and process video content based on natural language.
Not ideal if you are a business user seeking a ready-to-use application or a non-technical person looking for a general overview of video analysis.
Stars: 50
Forks: 2
Language: —
License: MIT
Category:
Last pushed: Apr 21, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/liveseongho/Awesome-Video-Language-Understanding"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
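The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names `build_quality_url` and `fetch_quality` are hypothetical, and reading the path as category/owner/repository is an assumption inferred from the single URL above. The response format is not documented on this page, so the sketch returns the raw response text rather than assuming a schema.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL for a repository given as 'owner/name'.

    Assumption: the path segments are <category>/<owner>/<name>,
    as suggested by the one example URL on this page.
    """
    owner, name = repo.split("/", 1)
    # Percent-encode each segment so unusual characters cannot alter the path.
    segments = [quote(part, safe="") for part in (category, owner, name)]
    return BASE_URL + "/" + "/".join(segments)


def fetch_quality(category: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the endpoint and return the response body as text.

    No API key is sent; the keyless tier is described as 100 requests/day.
    """
    request = Request(
        build_quality_url(category, repo),
        headers={"User-Agent": "quality-example"},  # hypothetical client name
    )
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)


# Example call (performs a network request, so it is left commented out):
# print(fetch_quality("ml-frameworks",
#                     "liveseongho/Awesome-Video-Language-Understanding"))
```

Keeping URL construction separate from the network call makes it possible to check the URL without spending any of the daily request quota.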
Higher-rated alternatives
open-mmlab/mmpretrain
OpenMMLab Pre-training Toolbox and Benchmark
facebookresearch/mmf
A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
adambielski/siamese-triplet
Siamese and triplet networks with online pair/triplet mining in PyTorch
HuaizhengZhang/Awsome-Deep-Learning-for-Video-Analysis
Papers, code and datasets about deep learning and multi-modal learning for video analysis
KaiyangZhou/pytorch-vsumm-reinforce
Unsupervised video summarization with deep reinforcement learning (AAAI'18)