xrsrke/instructGOOSE
Implementation of Reinforcement Learning from Human Feedback (RLHF)
This tool helps machine learning engineers fine-tune large language models to better follow human instructions. Starting from a pre-trained language model and a dataset of human preference feedback, it lets you train the model to produce responses more closely aligned with what humans prefer. It's aimed at ML practitioners who want to customize existing LLMs for specific tasks.
174 stars. No commits in the last 6 months.
Use this if you are a machine learning engineer looking to implement Reinforcement Learning from Human Feedback (RLHF) to align your language models with human preferences.
Not ideal if you are a non-developer or do not have experience with machine learning frameworks like PyTorch and Hugging Face Transformers.
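At the core of the RLHF pipeline described above is a reward model trained on pairwise human preferences. As a generic illustration (not instructGOOSE's actual API, whose function names are not documented here), the standard Bradley-Terry pairwise loss can be sketched in plain Python:

```python
import math

def preference_loss(r_chosen: float, r_rejected: float) -> float:
    """Bradley-Terry pairwise loss used when training an RLHF reward model.

    Given the reward model's scalar scores for the human-preferred response
    (r_chosen) and the rejected response (r_rejected), returns
    -log(sigmoid(r_chosen - r_rejected)). The loss shrinks as the model
    learns to score preferred responses higher.
    """
    margin = r_chosen - r_rejected
    # Numerically stable form: -log(sigmoid(m)) == log(1 + exp(-m))
    return math.log1p(math.exp(-margin))
```

In a real training loop the scores would come from a neural reward model and the loss would be averaged over a batch of preference pairs; this sketch only shows the objective itself.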
Stars: 174
Forks: 21
Language: Jupyter Notebook
License: MIT
Category:
Last pushed: Apr 07, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/xrsrke/instructGOOSE"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
Higher-rated alternatives
hud-evals/hud-python: OSS RL environment + evals toolkit
hiyouga/EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
OpenRL-Lab/openrl: Unified Reinforcement Learning Framework
sail-sg/oat: 🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning,...
opendilab/awesome-RLHF: A curated list of reinforcement learning with human feedback resources (continually updated)