RLHF Alignment Training LLM Tools

Tools and implementations for Reinforcement Learning from Human Feedback (RLHF), including reward modeling, policy optimization, and techniques for aligning LLMs with human preferences. Does NOT include general fine-tuning, inference optimization, or non-RLHF alignment methods.

There are 27 RLHF alignment training tools tracked. 5 score above 50 (the established tier). The highest-rated is hud-evals/hud-python at 65/100 with 316 stars. 1 of the top 10 is actively maintained.

Get all 27 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=llm-tools&subcategory=rlhf-alignment-training&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
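
For scripted access, a minimal Python sketch along the following lines should work. It uses only the standard library and the same query string as the curl command above; the shape of the returned JSON is not documented here, so the script simply pretty-prints the response rather than assuming field names.

```python
# Minimal sketch: fetch the RLHF alignment tools list from the public API.
# The response schema is an assumption (not documented above), so we just
# pretty-print whatever JSON comes back instead of guessing field names.
import json
import urllib.request

URL = (
    "https://pt-edge.onrender.com/api/v1/datasets/quality"
    "?domain=llm-tools&subcategory=rlhf-alignment-training&limit=20"
)

with urllib.request.urlopen(URL, timeout=30) as resp:
    data = json.load(resp)

print(json.dumps(data, indent=2))
```

Raising the limit parameter should cover all 27 entries; without a key, a handful of calls like this stays well within the 100 requests/day allowance.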

| # | Tool | Description | Score | Tier |
|---|------|-------------|-------|------|
| 1 | hud-evals/hud-python | OSS RL environment + evals toolkit | 65 | Established |
| 2 | hiyouga/EasyR1 | EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL | 58 | Established |
| 3 | OpenRL-Lab/openrl | Unified Reinforcement Learning Framework | 53 | Established |
| 4 | sail-sg/oat | 🌾 OAT: A research-friendly framework for LLM online alignment, including... | 53 | Established |
| 5 | opendilab/awesome-RLHF | A curated list of reinforcement learning with human feedback resources... | 50 | Established |
| 6 | NVlabs/GDPO | Official implementation of GDPO: Group reward-Decoupled Normalization Policy... | 46 | Emerging |
| 7 | xrsrke/instructGOOSE | Implementation of Reinforcement Learning from Human Feedback (RLHF) | 41 | Emerging |
| 8 | BaohaoLiao/SAGE | Self-Hinting Language Models Enhance Reinforcement Learning | 40 | Emerging |
| 9 | haoliuhl/chain-of-hindsight | Simple next-token-prediction for RLHF | 39 | Emerging |
| 10 | NJUNLP/GRRM | A novel Group Relative Reward Model (GRRM) framework enhances machine... | 37 | Emerging |
| 11 | arunprsh/ChatGPT-Decoded-GPT2-FAQ-Bot-RLHF-PPO | A Practical Guide to Developing a Reliable FAQ Chatbot with Reinforcement... | 36 | Emerging |
| 12 | SagnikMukherjee/sparsity_in_rl | Reinforcement Learning Finetunes Small Subnetworks in Large Language Models | 33 | Emerging |
| 13 | Jayluci4/micro-rlhf | RLHF in ~150 lines - understand how ChatGPT is aligned by building from scratch | 32 | Emerging |
| 14 | WisdomShell/RewardAnything | RewardAnything: Generalizable Principle-Following Reward Models | 30 | Emerging |
| 15 | rosinality/meshfn | Framework for Human Alignment Learning | 29 | Experimental |
| 16 | Zh1yuShen/MemBuilder | Code of "MemBuilder: Reinforcing LLMs for Long-Term Memory Construction via... | 28 | Experimental |
| 17 | zafstojano/policy-gradients | A minimal hackable implementation of policy gradient methods (GRPO, PPO, REINFORCE) | 28 | Experimental |
| 18 | hc495/StaICC | A standardized toolkit for classification task on In-context Learning... | 27 | Experimental |
| 19 | ALucek/rl-for-llms | Context & Guide For Reinforcement Learning with Verifiable Rewards with... | 27 | Experimental |
| 20 | AlignGPT-VL/AlignGPT | Official repo for "AlignGPT: Multi-modal Large Language Models with Adaptive... | 27 | Experimental |
| 21 | GAIR-NLP/ReAlign | Reformatted Alignment | 26 | Experimental |
| 22 | hggzjx/RewardAuditor | Official Repo for Paper: "Reward Auditor: Inference on Reward Modeling... | 26 | Experimental |
| 23 | psunlpgroup/FoVer | This repository includes code and materials for the paper "Generalizable... | 22 | Experimental |
| 24 | nielsyA/Tree-GRPO | 🌳 Enhance LLM agent performance with Tree-GRPO, leveraging tree search... | 22 | Experimental |
| 25 | safouaneelg/SRT2I | Class-Conditional self-reward mechanism for improved Text-to-Image models | 20 | Experimental |
| 26 | lafmdp/RLC | [ICLR'24] Official code for "Language Model Self-improvement by... | 14 | Experimental |
| 27 | ikun-llm/ikun-GRPO | Reinforcement learning alignment \| Group Relative Policy Optimization 🎮 | 14 | Experimental |