Open-Social-World/EgoNormia
EgoNormia | Benchmarking Physical Social Norm Understanding in VLMs
This tool helps AI researchers and developers assess how well Vision-Language Models (VLMs) understand and reason about physical social norms in real-world scenarios. You provide your VLM, and the benchmark evaluates it on a dataset of social interaction scenarios, producing a score that reflects how well the model interprets these situations. It's aimed at anyone working to improve the social intelligence of AI agents.
No commits in the last 6 months.
Use this if you are developing or evaluating Vision-Language Models and need a standardized way to measure their understanding of human social behavior in physical environments.
Not ideal if you are looking for a tool to develop or train new VLMs, as this is solely for benchmarking existing models.
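To make the benchmarking workflow concrete, here is a minimal sketch of the kind of evaluation loop a benchmark like this runs. It is not EgoNormia's actual API: the Scenario fields, the query_vlm hook, and the multiple-choice accuracy metric are all illustrative assumptions.

# Hypothetical evaluation loop for a VLM norm-understanding benchmark.
# Names and data layout are assumptions, not EgoNormia's real interface.
from dataclasses import dataclass

@dataclass
class Scenario:
    video_path: str        # ego-centric video clip
    question: str          # norm-understanding question about the clip
    choices: list[str]     # candidate actions or justifications
    answer_idx: int        # index of the ground-truth choice

def query_vlm(video_path: str, question: str, choices: list[str]) -> int:
    """Stand-in for a call to your model; should return the chosen index."""
    raise NotImplementedError("wire this up to your VLM's API")

def evaluate(scenarios: list[Scenario]) -> float:
    """Accuracy over all scenarios -- the kind of score the benchmark reports."""
    correct = sum(
        query_vlm(s.video_path, s.question, s.choices) == s.answer_idx
        for s in scenarios
    )
    return correct / len(scenarios)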
Stars: 12
Forks: 1
Language: Jupyter Notebook
License: MIT
Category:
Last pushed: Jun 18, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/Open-Social-World/EgoNormia"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
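If you'd rather call the endpoint from code than from curl, a minimal Python sketch using the requests library is below. The response schema and the auth header name are assumptions; inspect the returned JSON and the API docs for the real field names.

import requests

URL = ("https://pt-edge.onrender.com/api/v1/quality/"
       "transformers/Open-Social-World/EgoNormia")

# Anonymous access allows 100 requests/day; a free key raises that to 1,000.
# The header name below is an assumption -- check the API docs.
headers = {}  # e.g. {"Authorization": "Bearer <your-key>"}

resp = requests.get(URL, headers=headers, timeout=10)
resp.raise_for_status()
data = resp.json()

# Field names are guesses based on the stats shown on this page.
print(data.get("stars"), data.get("forks"), data.get("license"))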
Higher-rated alternatives
stanfordnlp/axbench
Stanford NLP Python library for benchmarking the utility of LLM interpretability methods
aidatatools/ollama-benchmark
LLM Benchmark for Throughput via Ollama (Local LLMs)
LarHope/ollama-benchmark
Ollama-based benchmark with detailed input/output tokens-per-second stats; written in Python, with a DeepSeek R1 example.
qcri/LLMeBench
Benchmarking Large Language Models
THUDM/LongBench
LongBench v2 and LongBench (ACL '25 & '24)