liushunyu/awesome-direct-preference-optimization

A Survey of Direct Preference Optimization (DPO)

/ 100

Experimental

This project surveys Direct Preference Optimization (DPO), a method for aligning large language models with human preferences. It gives a structured overview that categorizes existing DPO research by aspects such as data strategy and learning framework. It is aimed at AI researchers and practitioners working on aligning large language models with human feedback.

No commits in the last 6 months.

Use this if you are a researcher or practitioner looking for a structured overview and categorization of current research in Direct Preference Optimization to better understand or apply DPO methods for large language models.

Not ideal if you are looking for a plug-and-play code library or a tool to directly fine-tune a language model without needing to understand the underlying research and methodologies.

Large Language Models, AI Alignment, Machine Learning Research, Natural Language Processing, AI Ethics

No License, Stale 6m, No Package, No Dependents

Maintenance 2 / 25

Adoption 9 / 25

Maturity 8 / 25

Community 0 / 25

Higher-rated alternatives

codelion/pts

Pivotal Token Search

DtYXs/Pre-DPO

Pre-DPO: Improving Data Utilization in Direct Preference Optimization Using a Guiding Reference Model

RLHFlow/Directional-Preference-Alignment

Directional Preference Alignment

dannylee1020/openpo

Building synthetic data for preference tuning

pspdada/Uni-DPO

[ICLR 2026] Official repository of "Uni-DPO: A Unified Paradigm for Dynamic Preference...
