NVlabs/RLP

[ICLR 2026] Official PyTorch Implementation of RLP: Reinforcement as a Pretraining Objective

Emerging

This project helps AI researchers and developers pretrain large language models (LLMs) that reason before they answer. It adds a reinforcement learning objective to pretraining, which teaches the model to generate intermediate reasoning steps. The resulting models produce more accurate and robust outputs on complex tasks, especially in math and science.
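This listing does not spell out how the reinforcement signal is computed. As a rough illustration only, one plausible shape for such an objective is to reward a reasoning step by how much it raises the log-likelihood of the true next token, compared with predicting that token without the reasoning step. The sketch below is a toy in plain Python: the function names `log_softmax` and `reasoning_reward`, the three-token vocabulary, and the logit values are all hypothetical and are not taken from the NVlabs/RLP repository.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def reasoning_reward(logits_with_thought, logits_without_thought, target_id):
    """Hypothetical reward: the gain in log-likelihood of the true next
    token when the model conditions on its own reasoning step."""
    lp_with = log_softmax(logits_with_thought)[target_id]
    lp_without = log_softmax(logits_without_thought)[target_id]
    return lp_with - lp_without


# Toy vocabulary of 3 tokens; token 2 is the true next token.
baseline = [0.0, 0.0, 0.0]  # assumed prediction without a reasoning step

# A reasoning step that shifts probability toward the true token.
helpful = reasoning_reward([0.0, 0.0, 2.0], baseline, target_id=2)

# A reasoning step that shifts probability toward a wrong token.
unhelpful = reasoning_reward([0.0, 2.0, 0.0], baseline, target_id=2)

print(f"helpful reward:   {helpful:+.3f}")
print(f"unhelpful reward: {unhelpful:+.3f}")
```

Under these assumptions the reward is positive when the reasoning step makes the correct continuation more likely and negative when it misleads the prediction, so a policy-gradient update would favor useful reasoning steps. A real training setup would compute both sets of logits from model forward passes rather than hand-written lists.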

241 stars.

Use this if you are pre-training large language models and want to instill strong reasoning capabilities and improved accuracy from the very beginning, without significantly increasing computational cost.

Not ideal if you are looking for a tool to fine-tune an already pre-trained model or if your primary goal is to optimize for speed over complex reasoning.

large-language-models ai-pretraining reasoning-models scientific-ai mathematical-ai

No Package No Dependents

Maintenance 10 / 25

Adoption 10 / 25

Maturity 15 / 25

Community 11 / 25

Higher-rated alternatives

agentscope-ai/Trinity-RFT

Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement...

OpenRLHF/OpenRLHF

An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO &...

zjunlp/EasyEdit

[ACL 2024] An Easy-to-use Knowledge Editing Framework for LLMs.

huggingface/alignment-handbook

Robust recipes to align language models with human and AI preferences

hyunwoongko/nanoRLHF

nanoRLHF: from-scratch journey into how LLMs and RLHF really work.
