BatsResearch/planetarium

Dataset and benchmark for assessing LLMs in translating natural language descriptions of planning problems into PDDL

35 / 100 (Emerging)

Planetarium is a tool for developers who build or evaluate large language models (LLMs) that must understand and generate planning problems. It targets the task of translating natural language descriptions of planning tasks into PDDL (Planning Domain Definition Language), a formal planning language. It provides a dataset and a method for rigorously checking whether an LLM's generated PDDL matches a ground-truth PDDL description, without running a planner. The intended audience is AI researchers and developers working on automated planning and LLM capabilities.
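To make the idea of comparing generated PDDL against a ground truth without a planner concrete, here is a deliberately simplified sketch. It is not Planetarium's algorithm or API: the helper names (`tokenize`, `parse`, `section`, `naive_match`) are hypothetical, and the check only tests whether the `:init` and `:goal` sections contain the same atoms regardless of order. It does not handle renamed objects, nested goal connectives beyond a single `and`, or partially specified states, all of which a rigorous equivalence check would need to address.

```python
import re


def tokenize(text):
    """Split PDDL text into parentheses and lowercase symbols."""
    return re.findall(r"[()]|[^\s()]+", text.lower())


def parse(tokens):
    """Turn a token list into nested Python lists (one S-expression)."""
    stack = [[]]
    for tok in tokens:
        if tok == "(":
            stack.append([])
        elif tok == ")":
            done = stack.pop()
            stack[-1].append(done)
        else:
            stack[-1].append(tok)
    return stack[0][0]


def freeze(node):
    """Recursively convert nested lists to tuples so they can go in a set."""
    return tuple(freeze(n) for n in node) if isinstance(node, list) else node


def section(problem, name):
    """Return the atoms of a (:name ...) section as an order-free set."""
    for item in problem:
        if isinstance(item, list) and item and item[0] == name:
            body = item[1:]
            # Goals are commonly wrapped in (and ...); unwrap one level.
            if len(body) == 1 and body[0] and body[0][0] == "and":
                body = body[0][1:]
            return frozenset(freeze(atom) for atom in body)
    return frozenset()


def naive_match(ground_truth, predicted):
    """Hypothetical toy check: same :init and :goal atoms, in any order."""
    gt, pred = parse(tokenize(ground_truth)), parse(tokenize(predicted))
    return all(section(gt, s) == section(pred, s) for s in (":init", ":goal"))


GT = """(define (problem p) (:domain blocksworld) (:objects a b)
  (:init (clear a) (on a b) (on-table b))
  (:goal (and (on b a))))"""

# Same problem with the :init atoms listed in a different order.
REORDERED = """(define (problem p) (:domain blocksworld) (:objects a b)
  (:init (on-table b) (on a b) (clear a))
  (:goal (and (on b a))))"""

# Different goal, so the toy check should reject it.
WRONG_GOAL = REORDERED.replace("(on b a)", "(on a b)", 1).replace(
    "(:goal (and (on a b))))", "(:goal (and (clear b))))"
)
```

The point of the sketch is only that two PDDL texts can differ as strings yet describe the same problem, which is why string matching is a poor metric and a structural comparison is needed.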

No commits in the last 6 months.

Use this if you are developing or benchmarking LLMs that translate natural language instructions into formal planning problem descriptions like PDDL.

Not ideal if you are a practitioner who simply wants to generate plans for real-world problems without developing or evaluating an LLM.

LLM development, automated planning, AI research, natural language processing, model evaluation
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 8 / 25
Maturity 16 / 25
Community 11 / 25


Stars: 65
Forks: 6
Language: Python
License: BSD-3-Clause
Last pushed: Oct 16, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/BatsResearch/planetarium"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
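The same request can be made from Python using only the standard library. This is a minimal sketch under stated assumptions: the URL layout is copied from the curl example, the `transformers` path segment is treated as a category parameter (an assumption), the response is assumed to be JSON, and the function names `build_url` and `fetch_quality` are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(owner, repo, category="transformers"):
    """Build the quality endpoint URL for a repository.

    Treating "transformers" as a category segment is an assumption;
    the listing only shows this one concrete URL.
    """
    parts = [quote(p, safe="") for p in (category, owner, repo)]
    return "/".join([BASE_URL, *parts])


def fetch_quality(owner, repo, category="transformers", timeout=10):
    """Fetch the quality record, assuming the endpoint returns JSON."""
    with urlopen(build_url(owner, repo, category), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("BatsResearch", "planetarium")
# print(data)
```

Because the response schema is not documented here, the sketch returns the decoded JSON as-is rather than picking out specific fields.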