chasemetoyer/gameplay-vision-llm
Multimodal gameplay video understanding system combining vision, audio, and language models to enable long-horizon reasoning and question-answering over complex game environments.
This project helps video game analysts, testers, and content creators understand complex gameplay by analyzing video and audio footage. It takes raw gameplay video as input and returns detailed answers to natural-language questions about in-game events, player actions, and strategic outcomes. It is designed for anyone who needs deep insight into game performance and mechanics without manual frame-by-frame analysis.
Use this if you need to deeply analyze gameplay videos, understand 'why' certain events happened, track player strategies, or generate detailed summaries of long play sessions.
Not ideal if you're looking for a simple video editing tool or if your primary need is general video content recognition outside of game environments.
Stars: 9
Forks: 1
Language: Python
License: MIT
Category:
Last pushed: Dec 17, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/chasemetoyer/gameplay-vision-llm"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
llmware-ai/llmware
Unified framework for building enterprise RAG pipelines with small, specialized models
Sinapsis-AI/sinapsis-chatbots
Monorepo of Sinapsis templates supporting LLM-based agents
aimclub/ProtoLLM
Framework for prototyping LLM-based applications
Azure-Samples/azureai-foundry-finetuning-raft
A recipe that will walk you through using either Meta Llama 3.1 405B or OpenAI GPT-4o deployed...
xi029/Qwen3-VL-MoeLORA
Comparative LoRA fine-tuning experiments on Qwen's latest multimodal image-text model, Qwen3-VL-4B-Instruct, deployed with LangChain + RAG + multi-agent orchestration