voidful/TextRL

Implementation of RLHF (Reinforcement Learning from Human Feedback), as popularized by ChatGPT, for any generation model in Hugging Face's Transformers (bloomz-176B/BLOOM/GPT/BART/T5/MetaICL)

52 / 100 (Established)

This library helps AI developers improve the quality of text generated by large language models like GPT, T5, or BLOOM. It takes an existing pre-trained text generation model and a reward function (which defines what 'good' output means) and fine-tunes the model to produce text that better meets that criterion. It's for machine learning engineers and researchers who want to customize large language models for particular use cases.

564 stars. No commits in the last 6 months. Available on PyPI.

Use this if you need to fine-tune a Hugging Face text generation model to produce more desirable outputs based on a custom reward signal, making its generated text more aligned with specific goals (e.g., more positive sentiment, factual correctness, or a particular style).

Not ideal if you are not familiar with machine learning concepts like reinforcement learning, or if you simply need to use an off-the-shelf text generation model without advanced customization.
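The core idea of a custom reward signal can be illustrated without TextRL itself. Below is a minimal, self-contained sketch (plain Python, not TextRL's API) of scoring candidate generations with a hand-written reward function and selecting the highest-scoring one: a simplified best-of-n stand-in for the actual RL fine-tuning loop. The word lists and candidate texts are invented for illustration.

```python
import re

# Toy illustration (not TextRL's API): a custom reward function that
# prefers positive-sentiment text. In TextRL, a reward like this would
# instead drive reinforcement-learning fine-tuning of the model itself.
POSITIVE_WORDS = {"great", "good", "excellent", "helpful"}
NEGATIVE_WORDS = {"bad", "terrible", "useless"}

def reward(text: str) -> float:
    """Score text: +1 per positive word, -1 per negative word."""
    words = re.findall(r"\w+", text.lower())
    return sum(w in POSITIVE_WORDS for w in words) - sum(w in NEGATIVE_WORDS for w in words)

def best_of_n(candidates: list[str]) -> str:
    """Pick the highest-reward candidate (a stand-in for RL optimization)."""
    return max(candidates, key=reward)

candidates = [
    "This library is useless.",
    "This library is good and helpful.",
]
print(best_of_n(candidates))  # prints the positive-sentiment candidate
```

In real RLHF the reward is typically a learned model of human preferences rather than a word list, and the policy's weights are updated (e.g., via PPO) instead of merely reranking outputs; the sketch only shows where a reward function fits.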

large-language-models natural-language-generation model-fine-tuning reinforcement-learning-for-nlp generative-ai
Stale 6m
Maintenance 0 / 25
Adoption 10 / 25
Maturity 25 / 25
Community 17 / 25


Stars: 564
Forks: 61
Language: Python
License: MIT
Last pushed: May 09, 2024
Commits (30d): 0
Dependencies: 2

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/voidful/TextRL"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
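For scripted access, the same endpoint can be queried from Python. The sketch below uses only the standard library; the URL structure is taken from the curl example above, and the assumption that the response is JSON is not guaranteed by anything in this listing.

```python
import json
from urllib.request import urlopen

API_BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"

def quality_url(owner: str, repo: str) -> str:
    """Build the quality-data endpoint URL for a given repository."""
    return f"{API_BASE}/{owner}/{repo}"

url = quality_url("voidful", "TextRL")
print(url)  # same URL as the curl example

# Uncomment to fetch live data (requires network access; JSON response
# format is an assumption):
# with urlopen(url) as resp:
#     data = json.load(resp)
#     print(data)
```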