vicgalle/zero-shot-reward-models
ZYN: Zero-Shot Reward Models with Yes-No Questions
This project helps developers fine-tune large language models (LLMs) to generate specific kinds of text, such as positive movie reviews or non-toxic content. You write a prompt that asks a yes/no question about the desired attribute, and the system trains the LLM to produce outputs for which an evaluator model consistently answers 'yes'. It is aimed at machine learning engineers and researchers working on text generation.
No commits in the last 6 months.
Use this if you need to guide an instruction-tuned LLM to generate text that fulfills a specific, measurable quality or attribute without needing to collect extensive human preference data for a reward model.
Not ideal if you don't work with large language models, or if the text quality you want is highly subjective and cannot be framed as a clear yes/no question.
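The core idea above — scoring generated text by how strongly an evaluator LLM answers 'yes' to a quality question — can be sketched as a reward function. This is a minimal illustration, not the project's actual code: it assumes you can obtain the evaluator's logits for the 'yes' and 'no' answer tokens, and the prompt template and function names are hypothetical.

```python
import math

# Hypothetical prompt template: the yes/no question is appended to the
# generated text, and the evaluator LLM answers with 'yes' or 'no'.
PROMPT = (
    "Is the following movie review positive? Answer yes or no.\n\n"
    "Review: {text}\n\nAnswer:"
)

def zero_shot_reward(yes_logit: float, no_logit: float) -> float:
    """Reward = P('yes') under the evaluator LLM, computed as a softmax
    over the logits of the 'yes' and 'no' answer tokens.

    In practice these logits would come from a forward pass of the
    evaluator model on PROMPT.format(text=...); here they are plain
    floats so the sketch stays self-contained."""
    m = max(yes_logit, no_logit)  # subtract max for numerical stability
    e_yes = math.exp(yes_logit - m)
    e_no = math.exp(no_logit - m)
    return e_yes / (e_yes + e_no)
```

A fine-tuning loop (e.g. PPO) would then maximize this reward over sampled generations: text that makes the evaluator more confident in 'yes' scores closer to 1.0, and text scoring 'no' falls toward 0.0.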
Stars
35
Forks
8
Language
Python
License
MIT
Category
Last pushed
Aug 15, 2023
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/vicgalle/zero-shot-reward-models"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
liuqidong07/LLM-ESR
[NeurIPS'24 Spotlight] The official implementation code of LLM-ESR.
westlake-repl/IDvs.MoRec
End-to-end Training for Multimodal Recommendation Systems
amazon-science/AdaRec
Adaptive Generative Recommendations with Large Language Models
sichunluo/RecRanker
[TOIS'24] "RecRanker: Instruction Tuning Large Language Model as Ranker for Top-k Recommendation"
liuqidong07/LEADER-pytorch
[arXiv'24] The official implementation code of LEADER.