phy-q/benchmark

Phy-Q: A Testbed for Physical Reasoning

37 / 100 (Emerging)

This project provides a benchmark to test how well AI agents understand and react to real-world physics, much as humans or robots must. It takes an AI agent as input and evaluates its ability to solve tasks in a simulated environment based on 15 physical scenarios (such as rolling, falling, or structural stability). The output is a "Phy-Q score" that measures the agent's physical reasoning intelligence. This is for AI researchers and developers working on intelligent agents for robotics or other physical interaction systems.
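
To make the evaluation flow concrete, here is a minimal sketch of scoring an agent across scenarios. The PhysicsTask and Agent interfaces and the aggregation rule below are illustrative assumptions for this page, not the repository's actual API; the real benchmark defines its own tasks and Phy-Q aggregation.

from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class PhysicsTask:
    scenario: str                   # e.g. "rolling", "falling", "stability" (hypothetical labels)
    run: Callable[["Agent"], bool]  # returns True if the agent solved the task


class Agent:
    """Hypothetical agent interface: observe the scene, pick an action."""

    def act(self, observation):
        raise NotImplementedError


def evaluate(agent: Agent, tasks: List[PhysicsTask]) -> Dict[str, float]:
    """Return per-scenario pass rates for the given agent."""
    solved: Dict[str, List[bool]] = {}
    for task in tasks:
        solved.setdefault(task.scenario, []).append(task.run(agent))
    return {s: sum(results) / len(results) for s, results in solved.items()}


def phy_q_score(pass_rates: Dict[str, float]) -> float:
    """Toy aggregate: mean pass rate over scenarios, scaled to 0-100.

    The actual Phy-Q score uses the benchmark's own aggregation; this
    placeholder only shows where such a reduction would happen.
    """
    return 100.0 * sum(pass_rates.values()) / len(pass_rates)


# Example: a trivial agent evaluated on two toy tasks.
tasks = [
    PhysicsTask("rolling", lambda agent: True),
    PhysicsTask("falling", lambda agent: False),
]
print(phy_q_score(evaluate(Agent(), tasks)))  # 50.0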

No commits in the last 6 months.

Use this if you are developing or evaluating AI agents that need to reason about physical interactions and make decisions in dynamic environments.

Not ideal if you are looking for a general-purpose AI benchmark; this one focuses specifically on physical reasoning in simulated environments.

robotics · AI agent development · physical simulation · intelligent systems · reasoning evaluation
Stale (6m) · No Package · No Dependents
Maintenance 0 / 25
Adoption 8 / 25
Maturity 16 / 25
Community 13 / 25


Stars: 45
Forks: 6
Language: Python
License: MIT
Last pushed: Jul 29, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/phy-q/benchmark"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
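
If you prefer to query the endpoint from Python instead of curl, a minimal sketch is below. It assumes the endpoint returns JSON; no field names are assumed beyond what the payload itself reveals.

import requests

# Same endpoint as the curl example above.
url = "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/phy-q/benchmark"
response = requests.get(url, timeout=10)
response.raise_for_status()

data = response.json()
print(data)  # inspect the payload to see the actual fields returned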