bobxwu/learning-from-rewards-llm-papers

A comprehensive collection of papers on learning from rewards in the post-training and test-time scaling of LLMs, covering both reward models and learning strategies across the training, inference, and post-inference stages.

Score: 30 / 100 (Emerging)

This is a curated collection of research papers focused on how Large Language Models (LLMs) learn from rewards during and after their initial training. It organizes various methods for using 'reward models' and different learning strategies to improve LLM performance across different stages of development and use. Researchers and engineers working on fine-tuning, evaluating, or deploying LLMs will find this useful for understanding state-of-the-art techniques.

No commits in the last 6 months.

Use this if you are developing or researching large language models and need to explore methods for improving their alignment, reasoning, or code generation through reward-based learning.

Not ideal if you are a general user looking for pre-trained LLMs or a basic introduction to how LLMs work, as this resource is highly technical and specific to advanced LLM development.

Tags: LLM development · AI alignment · reinforcement learning · natural language processing · machine learning research
Badges: Stale (6m) · No Package · No Dependents
Maintenance: 2 / 25
Adoption: 8 / 25
Maturity: 15 / 25
Community: 5 / 25


Stars: 64
Forks: 2
Language: (not specified)
License: MIT
Last pushed: Jun 13, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/bobxwu/learning-from-rewards-llm-papers"

Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
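As a sketch, the same endpoint can be queried from Python with only the standard library. Note that the shape of the JSON response is not documented here, so the code below simply returns whatever the API sends back rather than assuming specific fields.

```python
import json
import urllib.request

# Endpoint taken verbatim from the curl example above.
URL = ("https://pt-edge.onrender.com/api/v1/quality/"
       "transformers/bobxwu/learning-from-rewards-llm-papers")

def fetch_quality(url: str = URL) -> dict:
    """Fetch the quality-score JSON for this repo.

    No API key is needed for up to 100 requests/day (per the note above).
    The returned dict's schema is whatever the service provides; it is
    not assumed here.
    """
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)

# Usage (requires network access):
# data = fetch_quality()
# print(json.dumps(data, indent=2))
```

The request is left commented out so the snippet can be read or imported without triggering a network call.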