DolbyUUU/Logic-RL-Lite
Lightweight replication study of DeepSeek-R1-Zero. Interesting findings include "No Aha Moment", "Longer CoT ≠ Accuracy", and "Language Mixing in Instruct Models".
This project investigates how to post-train large language models (LLMs) to improve their reasoning capabilities using only reinforcement learning, with no initial supervised fine-tuning. Starting from base LLMs and a dataset of logic puzzles, it analyzes how different models learn to reason and identifies key factors such as model size, base model choice, and training algorithm. It is aimed at AI researchers and machine learning engineers studying LLM reasoning and training methodologies.
No commits in the last 6 months.
Use this if you are researching how pure reinforcement learning impacts the reasoning abilities of language models, especially regarding logical puzzles.
Not ideal if you are looking for a ready-to-use LLM for general-purpose reasoning tasks or a tool for applying LLMs to business problems.
Stars: 50
Forks: —
Language: Python
License: —
Category: —
Last pushed: Apr 01, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/DolbyUUU/Logic-RL-Lite"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
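The same endpoint can be called programmatically. Below is a minimal Python sketch: the URL path comes from the curl command above, but the JSON shape of the response is an assumption, so the result is returned as an untyped dict.

```python
# Minimal sketch of querying the pt-edge quality API for a repo.
# The endpoint path is taken from the curl example on this page;
# the structure of the JSON response is NOT documented here and is
# treated as an opaque dict.
import json
import urllib.request

BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def quality_url(owner: str, repo: str) -> str:
    """Build the per-repository quality endpoint URL."""
    return f"{BASE}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch the quality record (anonymous access: 100 requests/day)."""
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)


# Usage (network call not executed here):
url = quality_url("DolbyUUU", "Logic-RL-Lite")
```

For higher limits, a free API key raises the quota to 1,000 requests/day; how the key is passed (header vs. query parameter) is not specified on this page, so check the API docs before adding it.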
Higher-rated alternatives
open-thought/reasoning-gym: [NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards
Hmbown/Hegelion: Dialectical reasoning architecture for LLMs (Thesis → Antithesis → Synthesis)
LLM360/Reasoning360: A repo for open research on building large reasoning models
TsinghuaC3I/Awesome-RL-for-LRMs: A Survey of Reinforcement Learning for Large Reasoning Models
bowang-lab/BioReason: Incentivizing Multimodal Biological Reasoning within a DNA-LLM Model | NeurIPS '25