declare-lab/LLM-PuzzleTest
This repository releases datasets and models for multimodal puzzle reasoning.
The project provides datasets and tools for evaluating how well large multimodal models understand and solve visual puzzles similar to those found in IQ tests. Given a model and a puzzle image, it measures the model's ability to identify the underlying pattern and select the correct answer. It is aimed at researchers and developers who build or test multimodal models and want to measure their abstract reasoning capabilities.
113 stars. No commits in the last 6 months.
Use this if you are a researcher or developer focused on building and evaluating the reasoning capabilities of multimodal AI models, and you need standardized benchmarks for visual abstract pattern recognition.
Not ideal if you are looking for a general-purpose AI model to solve your everyday visual tasks or a tool for human puzzle-solving.
Stars: 113
Forks: 8
Language: Python
License: MIT
Category:
Last pushed: Feb 26, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/declare-lab/LLM-PuzzleTest"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
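If you would rather call the endpoint from a script than from curl, a minimal Python sketch is shown below. The URL pattern is taken from the curl example above; the `quality_url` and `fetch_quality` helper names are hypothetical, and no particular shape is assumed for the JSON response.

```python
import json
import urllib.request

# Base endpoint inferred from the curl example above (assumption: the
# final two path segments are the repo owner and name).
API_BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"

def quality_url(owner: str, repo: str) -> str:
    """Build the quality-data URL for a given GitHub owner/repo pair."""
    return f"{API_BASE}/{owner}/{repo}"

def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch and decode the quality data; no API key needed up to 100 req/day."""
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)
```

For example, `fetch_quality("declare-lab", "LLM-PuzzleTest")` requests the same URL as the curl command above and returns the decoded JSON.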
Higher-rated alternatives
ExtensityAI/symbolicai
A neurosymbolic perspective on LLMs
TIGER-AI-Lab/MMLU-Pro
The code and data for "MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding...
deep-symbolic-mathematics/LLM-SR
[ICLR 2025 Oral] This is the official repo for the paper "LLM-SR" on Scientific Equation...
microsoft/interwhen
A framework for verifiable reasoning with language models.
zhudotexe/fanoutqa
Companion code for FanOutQA: Multi-Hop, Multi-Document Question Answering for Large Language...