atfortes/LLMSymbolicReasoningBench

Synthetic data generation for evaluating LLM symbolic and logic reasoning

Quality score: 28 / 100 (Experimental)

This project helps AI researchers and machine learning engineers create specialized training and evaluation data for large language models. It takes descriptions of symbolic reasoning tasks (like logic puzzles or specific linguistic patterns) and generates synthetic datasets. The output is custom-tailored data that helps evaluate how well an LLM handles complex reasoning challenges.

Use this if you need to generate unique, synthetic datasets to rigorously test and improve the symbolic and logic reasoning capabilities of your large language models.

Not ideal if you're looking for pre-existing, public datasets, or if your primary focus is on fine-tuning LLMs for tasks that don't heavily involve symbolic or logical reasoning.
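To give a flavor of the kind of item such a benchmark contains, here is a purely illustrative Python sketch. It does not use this repository's actual API; the function name and the item schema are hypothetical stand-ins for a synthetic logic-puzzle generator.

import random

def make_modus_ponens_item(rng: random.Random) -> dict:
    # Hypothetical generator, not part of LLMSymbolicReasoningBench.
    # Build a two-step implication chain over symbolic atoms: a -> b, b -> c, with a given.
    pool = ["P", "Q", "R", "S", "T"]
    a, b, c = rng.sample(pool, 3)
    premises = [f"If {a} then {b}.", f"If {b} then {c}.", f"{a} is true."]
    entailed = rng.random() < 0.5
    if entailed:
        target = c  # follows by applying modus ponens twice along the chain
    else:
        target = rng.choice([x for x in pool if x not in (a, b, c)])  # not derivable
    return {
        "premises": premises,
        "question": f"Does it follow that {target} is true?",
        "answer": "yes" if entailed else "unknown",
    }

print(make_modus_ponens_item(random.Random(0)))

Because every item is generated from a known rule, the ground-truth answer is available for free, which is what makes this style of synthetic data useful for rigorous LLM reasoning evaluation.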

Tags: LLM evaluation, synthetic data generation, AI research, natural language processing, reasoning benchmarks
No License, No Package, No Dependents
Maintenance: 10 / 25
Adoption: 6 / 25
Maturity: 8 / 25
Community: 4 / 25
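The overall rating appears to be the sum of these four components: 10 + 6 + 8 + 4 = 28 out of a possible 100.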

Stars: 22
Forks: 1
Language: Python
License: none
Last pushed: Mar 06, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/atfortes/LLMSymbolicReasoningBench"

Open to everyone: 100 requests/day with no key needed; a free key raises the limit to 1,000 requests/day.
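To consume the endpoint programmatically, a minimal Python sketch follows. It assumes the response body is JSON; the payload's field names are not documented in this listing, so inspect the output before relying on a schema.

import requests

URL = ("https://pt-edge.onrender.com/api/v1/quality/"
       "llm-tools/atfortes/LLMSymbolicReasoningBench")

resp = requests.get(URL, timeout=10)
resp.raise_for_status()  # an HTTP 429 here would likely mean the daily rate limit was hit
print(resp.json())       # assumed JSON body; schema not documented in this listing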