asaparov/prontoqa
Synthetic question-answering dataset to formally analyze the chain-of-thought output of large language models on a reasoning task.
This project helps researchers and practitioners evaluate how well large language models (LLMs) reason and explain their answers. It generates synthetic question-answering datasets in which each input is a short set of simple sentences and each output pairs the correct answer with a step-by-step chain of reasoning. Use it if you work on LLMs and need to rigorously test their deductive reasoning, especially on examples the model has not seen before.
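To make the format concrete, here is a minimal sketch of what one generated example might look like. The field names and the fictional ontology below are illustrative assumptions for this sketch, not the repository's exact output schema.

# Illustrative ProntoQA-style example (field names and the fictional
# ontology are assumptions, not the repository's actual schema).
example = {
    # A small synthetic ontology stated as simple sentences.
    "context": (
        "Every jompus is a vumpus. "
        "Every vumpus is a tumpus. "
        "Max is a jompus."
    ),
    # The deduction the model is asked to verify.
    "question": "True or false: Max is a tumpus.",
    # Gold chain of thought: one deduction step per sentence.
    "chain_of_thought": [
        "Max is a jompus.",
        "Every jompus is a vumpus.",
        "Max is a vumpus.",
        "Every vumpus is a tumpus.",
        "Max is a tumpus.",
    ],
    "answer": "True",
}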
156 stars. No commits in the last 6 months.
Use this if you need to create controlled datasets to formally analyze the 'chain-of-thought' explanations from large language models and understand their deductive reasoning.
Not ideal if you need a general-purpose dataset for training or fine-tuning language models on a wide variety of real-world tasks; this project is built specifically for controlled reasoning analysis.
Stars: 156
Forks: 16
Language: Python
License: Apache-2.0
Category:
Last pushed: Sep 09, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/asaparov/prontoqa"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
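If you prefer Python to curl, a minimal sketch using the requests library is below. The response schema is not documented here, so the code simply prints whatever JSON comes back; how an API key would be attached (header vs. query parameter) is also not specified, so this sketch makes an unauthenticated request.

import requests

URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools/asaparov/prontoqa"

# Anonymous access is limited to 100 requests/day; key-based
# authentication is omitted because its mechanism is not documented here.
response = requests.get(URL, timeout=30)
response.raise_for_status()
# The structure of the returned JSON is not specified, so just dump it.
print(response.json())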
Higher-rated alternatives
InternScience/GraphGen
GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation
timothepearce/synda
A CLI for generating synthetic data
rasinmuhammed/misata
High-performance open-source synthetic data engine. Uses LLMs for schema design and vectorized...
ziegler-ingo/CRAFT
[TACL, EMNLP 2025 Oral] Code, datasets, and checkpoints for the paper "CRAFT Your Dataset:...
ZhuLinsen/FastDatasets
A powerful tool for creating high-quality training datasets for Large Language Models...