mbzuai-oryx/Video-ChatGPT

[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.

47 / 100 (Emerging)

This project helps anyone who needs to understand and discuss video content. You provide a video and ask questions about it, and the model answers in natural language, describing what is happening in detail. It is aimed at professionals such as content analysts, educators, and researchers who need to grasp and articulate video information quickly.

1,497 stars. No commits in the last 6 months.

Use this if you need to quickly extract detailed information from videos through natural language conversations, without manually watching and transcribing.

Not ideal if you mainly need simple video transcription or object detection rather than deeper conversational understanding.

video-analysis content-understanding media-research conversational-AI digital-asset-management
Stale 6m · No Package · No Dependents
Maintenance 2 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 19 / 25


Stars: 1,497
Forks: 127
Language: Python
License: CC-BY-4.0
Last pushed: Aug 05, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/mbzuai-oryx/Video-ChatGPT"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
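
For scripted access, the curl request above could be reproduced in Python roughly as follows. This is a minimal sketch using only the standard library. The helper names `quality_url` and `fetch_quality` are hypothetical, and the sketch assumes the endpoint returns a JSON body; the response schema is not documented here, so the result is returned unparsed beyond JSON decoding.

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def quality_url(owner: str, repo: str) -> str:
    """Build the quality-data URL for a repository (hypothetical helper)."""
    return f"{BASE_URL}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality data without an API key.

    Assumption: the endpoint responds with JSON. The field names are
    not documented in this listing, so none are accessed here.
    """
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Prints the same URL that the curl command requests.
    print(quality_url("mbzuai-oryx", "Video-ChatGPT"))
```

Calling `fetch_quality("mbzuai-oryx", "Video-ChatGPT")` performs the actual request and counts against the keyless daily limit, so caching the result locally is advisable when polling several repositories.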