RLHFlow/Directional-Preference-Alignment
Directional Preference Alignment
This project helps you fine-tune large language models (LLMs) so their responses match specific user preferences, such as being more helpful or more verbose. You supply a prompt and the desired mix of attributes (e.g., 70% helpfulness, 30% verbosity), and the model tailors its output to that mix. This suits content creators, customer-service managers, or anyone who needs precise control over an LLM's output style and content.
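A minimal sketch of the idea in Python: the attribute mix is normalized into a preference direction and embedded in the prompt, so a single model can be steered at inference time. The system-prompt template and the model ID below are placeholders for illustration, not the repo's actual format or checkpoint name.

import math
from transformers import pipeline

# Hypothetical encoding: normalize a 70/30 helpfulness/verbosity mix
# into a unit-length preference direction.
w_help, w_verb = 0.7, 0.3
norm = math.hypot(w_help, w_verb)
v_help, v_verb = w_help / norm, w_verb / norm

# Placeholder system-prompt template; the repo defines its own format.
system = (f"You are a helpful assistant. Your reply should maximize "
          f"helpfulness*{v_help:.2f} + verbosity*{v_verb:.2f}.")
prompt = f"{system}\nUser: Summarize what a hash table is.\nAssistant:"

# Placeholder model ID; substitute the checkpoint released with the repo.
generator = pipeline("text-generation", model="RLHFlow/DPA-checkpoint")
print(generator(prompt, max_new_tokens=200)[0]["generated_text"])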
No commits in the last 6 months.
Use this if you need to precisely control the stylistic and content attributes of an LLM's generated text, beyond simple prompts.
Not ideal if you're looking for a general-purpose LLM without needing fine-grained control over specific output characteristics.
Stars: 58
Forks: 4
Language: —
License: Apache-2.0
Category: —
Last pushed: Sep 23, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/RLHFlow/Directional-Preference-Alignment"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
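For scripted use, a minimal Python equivalent of the curl call above, assuming the endpoint returns a JSON body (inspect the response for the actual schema):

import requests

url = ("https://pt-edge.onrender.com/api/v1/quality/llm-tools/"
       "RLHFlow/Directional-Preference-Alignment")
resp = requests.get(url, timeout=10)  # keyless access: 100 requests/day
resp.raise_for_status()
print(resp.json())  # assumes a JSON response; fields vary by service version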
Higher-rated alternatives
codelion/pts
Pivotal Token Search
DtYXs/Pre-DPO
Pre-DPO: Improving Data Utilization in Direct Preference Optimization Using a Guiding Reference Model
dannylee1020/openpo
Building synthetic data for preference tuning
pspdada/Uni-DPO
[ICLR 2026] Official repository of "Uni-DPO: A Unified Paradigm for Dynamic Preference...
liushunyu/awesome-direct-preference-optimization
A Survey of Direct Preference Optimization (DPO)