BatsResearch/planetarium

Dataset and benchmark for assessing LLMs in translating natural language descriptions of planning problems into PDDL

Emerging

Planetarium is a dataset and benchmark for developers who build or evaluate large language models (LLMs) that must translate natural language descriptions of planning tasks into PDDL, a formal planning language. It provides a dataset and a method to rigorously check whether an LLM's generated PDDL matches a ground-truth PDDL description, without needing to run a planner. The project is aimed at AI researchers and developers working on automated planning and LLM capabilities.
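To make the idea of checking generated PDDL against a ground truth without a planner concrete, here is a deliberately simplified sketch. It is not the project's evaluation method: it only compares the flat facts in the `:init` and `:goal` sections as unordered sets, and it ignores object renaming, negation, and quantifiers. The helper names (`extract_section`, `facts`, `same_problem`) are hypothetical and chosen for illustration.

```python
import re


def extract_section(pddl: str, keyword: str) -> str:
    """Return the balanced parenthesized block that starts at '(:keyword'."""
    start = pddl.find(f"(:{keyword}")
    if start == -1:
        return ""
    depth = 0
    for i in range(start, len(pddl)):
        if pddl[i] == "(":
            depth += 1
        elif pddl[i] == ")":
            depth -= 1
            if depth == 0:
                return pddl[start:i + 1]
    return ""  # parentheses never balanced


def facts(section: str) -> frozenset:
    """Collect innermost atoms such as (on a b) as tuples, ignoring order."""
    atoms = re.findall(r"\(([^():]+)\)", section)
    return frozenset(tuple(atom.lower().split()) for atom in atoms)


def same_problem(ground_truth: str, predicted: str) -> bool:
    """Assumption: two problems match if their init and goal fact sets match."""
    return all(
        facts(extract_section(ground_truth, key))
        == facts(extract_section(predicted, key))
        for key in ("init", "goal")
    )


GROUND_TRUTH = """
(define (problem stack-two) (:domain blocksworld)
  (:objects a b)
  (:init (on-table a) (on-table b) (clear a) (clear b) (arm-empty))
  (:goal (and (on a b))))
"""

# Same facts, listed in a different order.
REORDERED = """
(define (problem stack-two) (:domain blocksworld)
  (:objects a b)
  (:init (arm-empty) (clear b) (clear a) (on-table b) (on-table a))
  (:goal (and (on a b))))
"""

# Goal stacks the blocks the other way round.
WRONG_GOAL = REORDERED.replace("(on a b)", "(on b a)")

print(same_problem(GROUND_TRUTH, REORDERED))
print(same_problem(GROUND_TRUTH, WRONG_GOAL))
```

A check like this accepts a reordered but otherwise identical problem and rejects one whose goal differs. A rigorous benchmark has to go further, for example by treating problems that differ only in object names as equivalent.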

No commits in the last 6 months.

Use this if you are developing or benchmarking LLMs that translate natural language instructions into formal planning problem descriptions like PDDL.

Not ideal if you are a practitioner who simply wants to generate plans for real-world problems without developing or evaluating an LLM.

LLM development, automated planning, AI research, natural language processing, model evaluation

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 8 / 25

Maturity 16 / 25

Community 11 / 25

Language: Python

License: BSD-3-Clause

Higher-rated alternatives

xrsrke/toolformer

Implementation of Toolformer: Language Models Can Teach Themselves to Use Tools

MozerWang/AMPO

[ICLR 2026] Adaptive Social Learning via Mode Policy Optimization for Language Agents

real-stanford/reflect

[CoRL 2023] REFLECT: Summarizing Robot Experiences for Failure Explanation and Correction

nsidn98/LLaMAR

Code for our paper LLaMAR: LM-based Long-Horizon Planner for Multi-Agent Robotics

WayneMao/RoboMatrix

The Official Implementation of RoboMatrix
