allenai/RL4LMs
A modular RL library to fine-tune language models to human preferences
This project helps natural language processing (NLP) practitioners customize large language models for specific tasks: take an existing language model and fine-tune it with a reward function so that generated text aligns with human preferences or scores well on automatic metrics. It targets researchers, data scientists, and machine learning engineers who need to optimize text generation for tasks like summarization, translation, or dialogue.
2,382 stars. No commits in the last 6 months.
Use this if you need to fine-tune transformer-based language models to produce text that scores highly on specific, measurable criteria for tasks like summarization or question answering.
Not ideal if you are looking for a pre-trained, off-the-shelf language model solution without needing custom fine-tuning or advanced reinforcement learning techniques.
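The core idea is that the RL loop maximizes a scalar reward computed on generated text. As a rough illustration only (this is a hypothetical stand-in, not RL4LMs' actual reward classes or config format, which are documented in the repo), a metric-backed reward for summarization might score a generated summary against a reference via token-overlap F1:

```python
from collections import Counter


def overlap_f1_reward(generated: str, reference: str) -> float:
    """Toy reward: token-overlap F1 between generated and reference text.

    Hypothetical sketch of a metric-based reward (ROUGE-like) that an RL
    fine-tuning loop would maximize; not RL4LMs' real reward interface.
    """
    gen = generated.lower().split()
    ref = reference.lower().split()
    if not gen or not ref:
        return 0.0
    # Count tokens appearing in both texts (multiset intersection).
    overlap = sum((Counter(gen) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(gen)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

In a library like this, such a function would be wrapped as a reward component and called on each rollout; the policy model is then updated (e.g. with PPO) to increase the expected reward.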
Stars
2,382
Forks
202
Language
Python
License
Apache-2.0
Category
Last pushed
Mar 01, 2024
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/allenai/RL4LMs"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Related models
emredeveloper/Mem-LLM
Mem-LLM is a Python library for building memory-enabled AI assistants that run entirely on local...
cloudguruab/modsysML
Reinforcement learning from human feedback (RLHF) framework for AI models. Evaluate and compare LLM outputs,...
ManasVardhan/bench-my-llm
🏎️ Dead-simple LLM benchmarking CLI - latency, cost, and quality metrics
modal-labs/stopwatch
A tool for benchmarking LLMs on Modal
Mya-Mya/CBF-LLM
"CBF-LLM: Safe Control for LLM Alignment"