sail-sg/understand-r1-zero
Understanding R1-Zero-Like Training: A Critical Perspective
This project is a critical study of R1-Zero-style training for mathematical reasoning. It takes base models, such as those from Qwen or Llama, and applies reinforcement learning to significantly improve their ability to solve complex math problems, while analyzing what actually drives the gains. It is aimed at machine learning researchers building more capable models for mathematical problem-solving.
1,224 stars. No commits in the last 6 months.
Use this if you are a machine learning researcher focused on improving the mathematical reasoning capabilities of large language models through advanced training techniques like reinforcement learning.
Not ideal if you are a general user looking for a pre-trained model for everyday math problems or if you lack a deep understanding of LLM training and reinforcement learning concepts.
Stars: 1,224
Forks: 57
Language: Python
License: MIT
Category:
Last pushed: Aug 27, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/sail-sg/understand-r1-zero"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
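The same endpoint can be called programmatically. The sketch below is a minimal Python wrapper around the URL shown in the curl example; the `quality_url` and `fetch_quality` helper names are hypothetical, and the JSON response shape is not documented here, so the decoded payload is returned as-is.

```python
import json
import urllib.request

# Base path taken from the curl example above; everything after it
# (owner/repo) is appended per query.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"


def quality_url(owner: str, repo: str) -> str:
    """Build the API URL for a given GitHub owner/repo pair (hypothetical helper)."""
    return f"{API_BASE}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch and decode the JSON quality report.

    No API key is needed for up to 100 requests/day; the response
    schema is an assumption and is returned undecoded beyond JSON.
    """
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    print(quality_url("sail-sg", "understand-r1-zero"))
```

Calling `fetch_quality("sail-sg", "understand-r1-zero")` performs the same request as the curl command above.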
Higher-rated alternatives
cvs-health/uqlm
UQLM (Uncertainty Quantification for Language Models) is a Python package for UQ-based LLM...
PRIME-RL/TTRL
[NeurIPS 2025] TTRL: Test-Time Reinforcement Learning
sapientinc/HRM
Hierarchical Reasoning Model Official Release
tigerchen52/query_level_uncertainty
query-level uncertainty in LLMs
reasoning-survey/Awesome-Reasoning-Foundation-Models
✨✨Latest Papers and Benchmarks in Reasoning with Foundation Models