Peiyang-Song/Awesome-LLM-Reasoning-Failures
Repo for "Large Language Model Reasoning Failures"
This project curates academic papers on why Large Language Models (LLMs) make reasoning errors, what causes these failures, and how they can be mitigated. It organizes research on LLM limitations across reasoning types, from everyday social reasoning to formal logic and mathematics. It is aimed at anyone developing, evaluating, or deploying applications that rely on LLM outputs and who needs to understand common pitfalls.
Use this if you are a researcher, AI product manager, or ML engineer who needs to understand the current state of LLM reasoning capabilities and their documented failure modes, guiding more robust AI development.
Not ideal if you are looking for a practical guide or a tool for immediate, hands-on debugging of an LLM in a production environment, as it primarily curates academic literature.
Stars: 165
Forks: 13
Language: —
License: MIT
Category: —
Last pushed: Feb 17, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/Peiyang-Song/Awesome-LLM-Reasoning-Failures"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
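For scripted use, the same endpoint can be queried from Python. A minimal sketch using only the standard library, assuming the endpoint returns a JSON object (its schema is not documented here, so the result is treated as an opaque dict; `build_url` and `fetch_quality` are illustrative names, not part of the API):

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"

def build_url(owner: str, repo: str) -> str:
    # Percent-encode each path segment defensively; GitHub
    # owner/repo slugs are normally URL-safe already.
    return f"{BASE}/{quote(owner)}/{quote(repo)}"

def fetch_quality(owner: str, repo: str) -> dict:
    # Makes a network call. No API key header is sent, matching the
    # keyless 100 requests/day tier described above.
    with urlopen(build_url(owner, repo)) as resp:
        return json.load(resp)
```

Calling `fetch_quality("Peiyang-Song", "Awesome-LLM-Reasoning-Failures")` requests the same URL as the curl command above.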
Higher-rated alternatives
open-thought/reasoning-gym
[NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards
Hmbown/Hegelion
Dialectical reasoning architecture for LLMs (Thesis → Antithesis → Synthesis)
LLM360/Reasoning360
A repo for open research on building large reasoning models
TsinghuaC3I/Awesome-RL-for-LRMs
A Survey of Reinforcement Learning for Large Reasoning Models
bowang-lab/BioReason
BioReason: Incentivizing Multimodal Biological Reasoning within a DNA-LLM Model | NeurIPS '25