declare-lab/LLM-PuzzleTest

This repository is maintained to release datasets and models for multimodal puzzle reasoning.

Score: 36 / 100 (Emerging)

This project provides datasets and tools for evaluating how well large multimodal AI models understand and solve visual puzzles, similar to those found in IQ tests. Given an AI model and a puzzle image, it measures the model's ability to identify the underlying pattern and produce the correct answer. This is useful for researchers and developers building or testing advanced AI models who want to measure abstract reasoning capabilities.

113 stars. No commits in the last 6 months.

Use this if you are a researcher or developer focused on building and evaluating the reasoning capabilities of multimodal AI models, and you need standardized benchmarks for visual abstract pattern recognition.

Not ideal if you are looking for a general-purpose AI model to solve your everyday visual tasks or a tool for human puzzle-solving.

AI-model-evaluation multimodal-AI cognitive-AI AI-benchmarking abstract-reasoning
Status: Stale (6 months) · No package · No dependents
Maintenance: 0 / 25
Adoption: 9 / 25
Maturity: 16 / 25
Community: 11 / 25


Stars: 113
Forks: 8
Language: Python
License: MIT
Last pushed: Feb 26, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/declare-lab/LLM-PuzzleTest"

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
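For scripted use, the same endpoint can be called from Python with only the standard library. This is a minimal sketch: the URL structure is copied from the curl command above (including the `transformers` path segment), but the shape of the JSON response is an assumption, since it is not documented here.

```python
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(registry: str, owner: str, repo: str) -> str:
    """Build the quality-endpoint URL for a repository."""
    return f"{API_BASE}/{registry}/{owner}/{repo}"


def fetch_quality(registry: str, owner: str, repo: str) -> dict:
    """Fetch the quality report and parse it as JSON.

    The response schema is an assumption; inspect the returned
    dict before relying on specific fields.
    """
    with urllib.request.urlopen(quality_url(registry, owner, repo)) as resp:
        return json.load(resp)


# Mirrors the curl command above:
# report = fetch_quality("transformers", "declare-lab", "LLM-PuzzleTest")
```

Building the URL separately from the fetch keeps the request testable offline; swap `urllib` for `requests` if it is already a dependency.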