LLM Reasoning Research LLM Tools

Research frameworks, benchmarks, and methods for evaluating and advancing LLM reasoning capabilities across domains. Includes reasoning architectures, inference-time scaling, RL-based reasoning training, and reasoning-specific datasets. Does NOT include general LLM fine-tuning, application tools that use reasoning, or non-LLM reasoning systems.

There are 72 LLM reasoning research tools tracked. 1 scores 70 or above (Verified tier). The highest-rated is open-thought/reasoning-gym at 70/100 with 1,367 stars. 1 of the top 10 is actively maintained.

Get all 72 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=llm-tools&subcategory=llm-reasoning-research&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
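The same request can be made from Python. The following is a minimal sketch using only the standard library; it reuses the endpoint and query parameters from the curl command above. The helper names `build_url` and `fetch_projects` are hypothetical, and the shape of the returned JSON is not documented here, so the code only decodes the body and returns it as-is. The curl example passes `limit=20`; whether a larger `limit` returns all 72 projects is an assumption that should be checked against the API.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="llm-tools", subcategory="llm-reasoning-research", limit=20):
    """Build the request URL with the same query parameters as the curl example."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(limit=20, timeout=30):
    """Fetch the dataset and return the decoded JSON body.

    The response structure is not specified in this page, so the caller
    should inspect the returned object before relying on any field names.
    """
    with urllib.request.urlopen(build_url(limit=limit), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_projects()` performs one request, which counts toward the 100 requests/day keyless quota mentioned above.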

#	Tool	Score	Tier	Stars	Language
1	open-thought/reasoning-gym [NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning...	70	Verified	1,367	Python
2	Hmbown/Hegelion Dialectical reasoning architecture for LLMs (Thesis → Antithesis → Synthesis)	54	Established	137	Python
3	LLM360/Reasoning360 A repo for open research on building large reasoning models	51	Established	140	Python
4	TsinghuaC3I/Awesome-RL-for-LRMs A Survey of Reinforcement Learning for Large Reasoning Models	50	Established	2,368	TeX
5	bowang-lab/BioReason BioReason: Incentivizing Multimodal Biological Reasoning within a DNA-LLM...	50	Established	374	Jupyter Notebook
6	Peiyang-Song/Awesome-LLM-Reasoning-Failures Repo for "Large Language Model Reasoning Failures"	47	Emerging	165	—
7	ZichengXu/Decoding-Tree-Sketching Decoding Tree Sketching (DTS): a training-free & model-agnostic & plug-in...	46	Emerging	67	Python
8	princeton-nlp/tree-of-thought-llm [NeurIPS 2023] Tree of Thoughts: Deliberate Problem Solving with Large...	46	Emerging	5,873	Python
9	manglu097/Thoth [ICLR 2026] Unleashing Scientific Reasoning for Bio-experimental Protocol...	45	Emerging	65	Python
10	jieyilong/tree-of-thought-puzzle-solver The Tree of Thoughts (ToT) framework for solving complex reasoning tasks using LLMs	45	Emerging	371	Python
11	Agent-RL/ReCall ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning...	44	Emerging	1,343	Python
12	WeiboAI/VibeThinker Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model...	44	Emerging	575	Python
13	PRIME-RL/PRIME Scalable RL solution for advanced reasoning of language models	43	Emerging	1,813	Python
14	mohammad-gh009/DrugReasoner Predicting drug approval with reasoning.	43	Emerging	11	Python
15	sileod/reasoning-core Procedural symbolic reasoning data generators suite for synthetic pretraining	42	Emerging	35	Python
16	MiniMax-AI/SynLogic [NeurIPS 2025] The official repo of SynLogic: Synthesizing Verifiable...	42	Emerging	198	Python
17	PPPP-kaqiu/Awesome-Parallel-Reasoning Awesome-Parallel-Reasoning: Unlocking the reasoning potential of LLMs. ...	42	Emerging	49	HTML
18	Strong-AI-Lab/Logical-and-abstract-reasoning Evaluation on Logical Reasoning and Abstract Reasoning Challenges	41	Emerging	29	Python
19	sileod/reasoning_core Procedural symbolic reasoning data generators suite for synthetic pretraining	39	Emerging	34	Python
20	diagram-of-thought/diagram-of-thought Official implementation of paper "On the Diagram of Thought"...	39	Emerging	193	—
21	The-Martyr/Awesome-Multimodal-Reasoning Latest Advances on (RL based) Multimodal Reasoning and Generation in...	39	Emerging	48	—
22	TiMEM-AI/timem Temporal-Hierarchical Memory Consolidation for Long-Horizon Conversational Agents	39	Emerging	82	Python
23	Wang-ML-Lab/TokUR [ICLR 2026] TokUR: Token-Level Uncertainty Estimation for Large Language...	39	Emerging	4	Python
24	amazon-science/TISER [ACL 2025] Learning to Reason Over Time: Timeline Self-Reflection for...	37	Emerging	9	—
25	madaan/llm-reasoning-tutorial Resources for few-shot reasoning tutorial	36	Emerging	15	Jupyter Notebook
26	satori-reasoning/Satori [ICML 2025] Satori: Reinforcement Learning with Chain-of-Action-Thought...	36	Emerging	109	Python
27	intuit-ai-research/SPUQ SPUQ: Perturbation-Based Uncertainty Quantification for Large Language Models	36	Emerging	15	Python
28	geon0325/TimeCAP Source code for the AAAI 2025 paper "TimeCAP: Learning to Contextualize,...	35	Emerging	50	Python
29	luban-agi/Awesome-LLM-reasoning A curated paper list on LLM reasoning.	33	Emerging	90	—
30	Alsace08/Meta-Reasoning Code and Data Repo for ACL'24 Paper "Meta-Reasoning: Semantics-Symbol...	33	Emerging	7	—
31	Yinghao-Li/Minesweeper-for-LLM Code for paper: Assessing Logical Puzzle Solving in Large Language Models:...	32	Emerging	5	Python
32	LAMDASZ-ML/Awesome-LLM-Reasoning-with-NeSy ✨✨Latest Advances on Neuro-Symbolic Learning in the era of Large Language Models	32	Emerging	274	—
33	Osilly/Awesome-Interleaving-Reasoning Interleaving Reasoning: Next-Generation Reasoning Systems for AGI	32	Emerging	260	—
34	klietus/SignalZero Local first symbolic reasoning stack for large language models. Inference...	31	Emerging	3	Python
35	sail-sg/CLoT CVPR'24, Official Codebase of our Paper: "Let's Think Outside the Box:...	30	Emerging	322	Python
36	multimodal-art-projection/LatentCoT-Horizon 📖 This is a repository for organizing papers, codes, and other resources...	30	Emerging	367	—
37	Taishi-N324/Awesome-RL-Reasoning Awesome-RL-Reasoning	30	Emerging	14	—
38	fblgit/tree-of-knowledge ToK aka Tree of Knowledge for Large Language Models LLM. It's a novel...	29	Experimental	56	—
39	krystalan/DRT Deep Reasoning Translation (DRT) Project	29	Experimental	240	—
40	atfortes/LLMSymbolicReasoningBench Synthetic data generation for evaluating LLM symbolic and logic reasoning	28	Experimental	22	Python
41	BDML-lab/llm-inductive-reasoning-survey This is the repository for the paper ‘A Survey of Inductive Reasoning for...	28	Experimental	46	—
42	Mihir3009/GridPuzzle An evaluation dataset comprising 274 grid-based puzzles with different...	28	Experimental	8	—
43	zhiyuanhubj/UoT [NeurIPS 2024] Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances...	28	Experimental	106	Python
44	141forever/inductive-reasoning-papers The Paper Collection of Inductive Reasoning from 2015 to 2025	27	Experimental	22	—
45	Ruiyang-061X/Uncertainty-o ✨ Official code for our paper: "Uncertainty-o: One Model-agnostic Framework...	26	Experimental	18	Python
46	OSU-NLP-Group/cobalt Code and data for the paper "Bridging Online and Offline RL: Contextual...	26	Experimental	9	Python
47	kang-ml/LogicTree [EMNLP 2025 Main] LogicTree: Structured Proof Exploration for Coherent and...	25	Experimental	6	Python
48	PurCL/ProSec Official repo for "ProSec: Fortifying Code LLMs with Proactive Security Alignment"	24	Experimental	17	Python
49	plusnli/MITS [PAKDD 2026 oral] MITS: Enhanced Tree Search Reasoning for LLMs via...	24	Experimental	3	Python
50	Pomilon-Intelligence-Lab/CRSM CRSM (Continuous Reasoning State Model): An asynchronous "System 2"...	24	Experimental	1	Python
51	OSU-NLP-Group/llm-planning-eval [ACL'24] Code and data of paper "When is Tree Search Useful for LLM...	23	Experimental	54	Python
52	JIA-Lab-research/MoTCoder This is the official code repository of MoTCoder: Elevating Large Language...	23	Experimental	85	Python
53	THUNLP-MT/symbol2language Speak It Out: Solving Symbol-Related Problems with Symbol-to-Language...	22	Experimental	6	—
54	Pro-GenAI/S3Q-Reasoning Scratchpad 3Q Reasoning: Improving Truthfulness and Reducing Hallucination...	22	Experimental	4	Python
55	jxhuang0508/Awesome-LLM-Reasoning-OpenAI-o1 Awesome LLM papers, news and projects about learning to reason with LLM,...	22	Experimental	27	—
56	naivoder/MCTSr Monte Carlo Tree Search Self-Refine (MCTSr)	22	Experimental	22	Python
57	Letian2003/C-VQA Counterfactual Reasoning VQA Dataset	22	Experimental	28	Python
58	Simula-COMPLEX/tbrullm Technical Briefing on LLM-Assisted Uncertainty Analysis	21	Experimental	—	HTML
59	VITA-Group/o1-planning [NeurIPS'24 LanGame workshop] On The Planning Abilities of OpenAI's o1...	21	Experimental	42	Python
60	zzcnewly/ContPhy-Gen Codebase and tutorial of ContPhy dataset generation for ICML 2024 paper...	21	Experimental	10	C#
61	sylvain-wei/24-Game-Reasoning A very simple reproduction of Deepseek-R1-Zero and Deepseek-R1, using the "24 game" as an example. Uses zero-RL, SFT, and SFT+RL to elicit the LLM's autonomous...	21	Experimental	34	Python
62	prodesk98/SQL-LLM-Distillation-GRPO Inspired by mathematical reasoning models like DeepSeekMath, this framework...	21	Experimental	6	Jupyter Notebook
63	SagnikMukherjee/PARC Premise-Augmented Reasoning Chains Improve Error Identification in Math...	21	Experimental	5	Python
64	mukhal/GRACE [EMNLP '23] Discriminator-Guided Chain-of-Thought Reasoning	21	Experimental	50	Python
65	Skrapma4872/S3Q-Reasoning 📝 Enhance large language model outputs by revealing assumptions with a...	21	Experimental	—	Python
66	ihasq/OpenReasoning Turn Ultralight Bogo Model Into SOTA Reasoning Expert	17	Experimental	—	—
67	DolbyUUU/Logic-RL-Lite Lightweight replication study of DeepSeek-R1-Zero. Interesting findings...	16	Experimental	50	Python
68	Xnhyacinth/Awesome-Latent-Reasoning 🔥 Must-read papers for LLM-based Latent Reasoning	13	Experimental	9	—
69	jpordoy/-Dynamic-Multi-Chain-Multi-Path-Reasoning-with-Consensus Multi-path reasoning with dynamic chains and consensus scoring for improved...	12	Experimental	1	Jupyter Notebook
70	John-Wendell/Long-CoT-data-for-LLM-to-solve-24-puzzle It is a dataset for fine-tuning LLMs to solve 24(puzzle)	12	Experimental	7	Python
71	OpenMOSS/Ultra-Innerthought Ultra-Innerthought is a bilingual (Chinese and English) open-domain R1/o1...	11	Experimental	3	—
72	NLPForUA/DUMY Ukrainian Multidomain Reasoning Dataset	10	Experimental	1	—