jiayuww/SpatialEval

[NeurIPS'24] SpatialEval: a benchmark to evaluate spatial reasoning abilities of MLLMs and LLMs

Score: 23 / 100 (Experimental)

SpatialEval helps AI researchers and developers assess how well large language models (LLMs) and vision-language models (VLMs) understand spatial concepts. Given a model and a set of spatial reasoning questions (text-only, image-only, or both), it produces a performance evaluation of the model's grasp of spatial relationships, object positions, counting, and navigation. It is aimed at anyone building or evaluating AI models that need to reason about spatial information.

No commits in the last 6 months.

Use this if you are a researcher or AI developer working on large language models or vision-language models and need a standardized way to benchmark their spatial reasoning capabilities.

Not ideal if you are a general user looking to solve a specific business problem, as this is a research-focused benchmark tool for AI model evaluation.

AI-model-evaluation · spatial-reasoning · large-language-models · vision-language-models · AI-benchmarking
No License · Stale (6 months) · No Package · No Dependents
Maintenance 0 / 25
Adoption 8 / 25
Maturity 8 / 25
Community 7 / 25

How are scores calculated?

Stars: 59
Forks: 3
Language: Python
License: None
Last pushed: Jan 23, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/jiayuww/SpatialEval"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
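The same endpoint shown in the curl command can be queried programmatically. A minimal Python sketch, assuming the endpoint returns JSON; the helper names `build_url` and `fetch_quality` are hypothetical, and the response schema is not documented here, so the example just prints whatever comes back:

```python
"""Sketch of querying the quality API for a repository.

Assumptions: the endpoint returns JSON, and anonymous access is
allowed up to 100 requests/day (per the docs above).
"""
import json
import urllib.request

# Base URL taken from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"


def build_url(owner: str, repo: str) -> str:
    """Construct the per-repository quality endpoint URL."""
    return f"{API_BASE}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch the quality report for a repository (network call;
    no API key needed within the anonymous rate limit)."""
    with urllib.request.urlopen(build_url(owner, repo)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Prints the raw JSON report for this repository.
    print(json.dumps(fetch_quality("jiayuww", "SpatialEval"), indent=2))
```

With a free API key (1,000 requests/day), you would presumably attach it to the request, though the exact header or query parameter is not specified on this page.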