GlitchBench/Benchmark
Code and Data for GlitchBench
This project evaluates how well large multimodal models can spot unusual or 'glitched' situations in video games. It takes game footage as input and assesses whether a model can identify and explain unexpected events or odd visual compositions. It is aimed at anyone developing or evaluating large multimodal models for visual understanding, especially in complex or unusual scenarios.
No commits in the last 6 months.
Use this if you need to benchmark the common-sense reasoning and visual recognition abilities of large multimodal AI models, particularly their capacity to detect anomalies in video game content.
Not ideal if you are looking for a tool to develop new AI models or directly apply AI for real-time glitch detection in games, as this is purely for academic benchmarking.
Stars: 13
Forks: —
Language: Python
License: —
Category: —
Last pushed: Feb 27, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/GlitchBench/Benchmark"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
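For scripted access, the same endpoint shown in the curl command can be queried from Python. A minimal sketch; the URL comes from the curl example above, but the response field names (`stars`, `language`, `commits_30d`) are hypothetical placeholders, so check the actual payload before relying on them:

```python
import json
from urllib.parse import quote

BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"

def quality_url(repo: str) -> str:
    """Build the API URL for an 'owner/name' repo slug.

    quote() keeps '/' unescaped by default, so the slug stays readable.
    """
    return f"{BASE}/{quote(repo)}"

# Hypothetical response shape -- the real field names may differ.
sample = '{"stars": 13, "language": "Python", "commits_30d": 0}'
data = json.loads(sample)

print(quality_url("GlitchBench/Benchmark"))
print(data["stars"], data["language"])
```

In a real script you would fetch `quality_url(...)` with `urllib.request.urlopen` or `requests.get` and pass the response body to `json.loads` instead of the sample string.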
Higher-rated alternatives
TsinghuaC3I/MARTI
A Framework for LLM-based Multi-Agent Reinforced Training and Inference
zjunlp/KnowLM
An open-sourced knowledgeable large language model framework.
cli99/llm-analysis
Latency and Memory Analysis of Transformer Models for Training and Inference
tanyuqian/redco
NAACL '24 (Best Demo Paper Runner-Up) / MLSys @ NeurIPS '23 - RedCoast: A Lightweight Tool to...
stanleylsx/llms_tool
A training and testing tool for large language models built on HuggingFace. Supports a web UI and terminal inference for each model, parameter-efficient and full-parameter training (pre-training, SFT, RM, PPO, DPO), as well as model merging and quantization.