OpenGVLab/Ask-Anything

[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.

Emerging

This project allows you to ask questions about the content of videos and get natural language answers. You input a video file and your questions, and it provides descriptive text responses about actions, objects, and events within the video. Anyone who needs to quickly understand or summarize video content, such as content analysts, researchers, or educators, would find this useful.

3,335 stars. No commits in the last 6 months.

Use this if you need to extract specific information or generate a summary from video footage by simply asking questions in plain English.

Not ideal if you primarily need precise timestamped event detection or highly detailed, frame-by-frame analysis.

video-analysis content-moderation media-research video-search lecture-analysis

Stale 6m No Package No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 19 / 25

Stars: 3,335

Forks: 268

Language: Python

License: MIT

Higher-rated alternatives

jingyaogong/minimind-v

🚀 Train a 26M-parameter visual multimodal VLM from scratch in just 1 hour!

SkyworkAI/Skywork-R1V

Skywork-R1V is an advanced multimodal AI model series developed by Skywork AI, specializing in...

roboflow/vision-ai-checkup

Take your LLM to the optometrist.

zai-org/GLM-TTS

GLM-TTS: Controllable & Emotion-Expressive Zero-shot TTS with Multi-Reward Reinforcement Learning

NExT-GPT/NExT-GPT

Code and models for ICML 2024 paper, NExT-GPT: Any-to-Any Multimodal Large Language Model
