voidful/TextRL
Implementation of ChatGPT-style RLHF (Reinforcement Learning from Human Feedback) for any generation model in Hugging Face's Transformers (bloomz-176B/BLOOM/GPT/BART/T5/MetaICL)
This library helps AI developers improve the quality of text generated by large language models like GPT, T5, or BLOOM. It takes an existing pre-trained text generation model and a reward function (which defines what 'good' output means) and fine-tunes the model to produce text that better meets specific criteria. It's aimed at machine learning engineers and researchers who want to customize large language models for particular use cases.
564 stars. No commits in the last 6 months. Available on PyPI.
Use this if you need to fine-tune a Hugging Face text generation model to produce more desirable outputs based on a custom reward signal, making its generated text more aligned with specific goals (e.g., more positive sentiment, factual correctness, or a particular style).
Not ideal if you are not familiar with machine learning concepts like reinforcement learning, or if you simply need to use an off-the-shelf text generation model without advanced customization.
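The core idea is the custom reward signal: a function that scores generated text, which the RL loop then maximizes. In TextRL itself the reward is supplied by subclassing the library's environment class (see the project README for the exact signature); the sketch below illustrates only the reward idea, using a hypothetical keyword-based sentiment score rather than TextRL's API or a real trained reward model.

```python
# Conceptual sketch of a reward function for RLHF-style fine-tuning.
# This is NOT TextRL's actual API; all names here are hypothetical.
# A real setup would typically score text with a trained reward model
# (e.g., a sentiment classifier), not a keyword list.

def sentiment_reward(generated_text: str) -> float:
    """Toy reward: +1 per positive keyword, -1 per negative keyword."""
    positive = {"good", "great", "excellent", "helpful"}
    negative = {"bad", "terrible", "useless"}
    tokens = generated_text.lower().split()
    score = sum(t in positive for t in tokens) - sum(t in negative for t in tokens)
    return float(score)

print(sentiment_reward("this library is great and very helpful"))  # 2.0
print(sentiment_reward("a terrible idea"))                         # -1.0
```

During training, the policy (the language model) generates text, this function assigns each sample a scalar reward, and a policy-gradient algorithm such as PPO updates the model toward higher-reward outputs.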
Stars
564
Forks
61
Language
Python
License
MIT
Category
Last pushed
May 09, 2024
Commits (30d)
0
Dependencies
2
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/voidful/TextRL"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Related tools
openai/openai-cookbook
Examples and guides for using the OpenAI API
rgbkrk/dangermode
Execute IPython & Jupyter from the comforts of chat.openai.com
CogStack/OpenGPT
A framework for creating grounded instruction based datasets and training conversational domain...
Declipsonator/GPTZzzs
Large language model detection evasion through grammar and vocabulary modification.
antononcube/Python-JupyterChatbook
Python package of a Jupyter extension that facilitates the interaction with LLMs.