CJReinforce/PURE

Official code for the paper "Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning".

Quality score: 32 / 100 (Emerging)

This project helps AI researchers and machine learning engineers fine-tune large language models (LLMs) to improve their reasoning abilities, particularly for complex mathematical problems. It takes an existing LLM and a dataset of mathematical prompts with process rewards, then outputs a more accurate and efficient LLM for solving reasoning tasks. The end-user is typically an expert working on advanced AI model development.


Use this if you are developing highly capable LLMs for reasoning tasks and need to efficiently fine-tune them using process-supervised or verifiable rewards to achieve state-of-the-art accuracy with fewer resources.

Not ideal if you are looking for a pre-trained, off-the-shelf LLM or if your primary goal is not to advance reasoning capabilities through novel fine-tuning techniques.

Tags: AI model development · LLM fine-tuning · reasoning AI · machine learning research · natural language processing
No License · No Package · No Dependents
Maintenance: 6 / 25
Adoption: 10 / 25
Maturity: 8 / 25
Community: 8 / 25
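The four sub-scores above are each out of 25 and appear to sum to the overall 32 / 100. A minimal sketch of that arithmetic (the dictionary keys are illustrative labels, not the API's field names):

```python
# Sub-scores from the card above; each is out of 25, so the
# overall quality score is their sum out of 100.
subscores = {
    "maintenance": 6,
    "adoption": 10,
    "maturity": 8,
    "community": 8,
}

overall = sum(subscores.values())
print(overall)  # 32, matching the 32 / 100 shown above
```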


Stars: 160
Forks: 7
Language: Python
License: None
Last pushed: Oct 23, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/CJReinforce/PURE"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
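The same endpoint can be called from Python using only the standard library. This is a minimal sketch: the URL path is copied verbatim from the curl example above, and the response schema is not documented here, so the helper simply returns the decoded JSON as a dict.

```python
import json
from urllib.request import urlopen

# Base endpoint taken from the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/quality"

def fetch_quality(path: str) -> dict:
    """Fetch and decode the JSON quality report for a repo path.

    The response field names are not documented on this card,
    so no schema is assumed beyond "it returns JSON".
    """
    with urlopen(f"{BASE}/{path}") as resp:
        return json.load(resp)

# Build the URL for this repository (matches the curl example).
url = f"{BASE}/transformers/CJReinforce/PURE"
print(url)
```

Without an API key this is limited to 100 requests per day, so cache responses rather than refetching on every lookup.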