jackaduma/Alpaca-LoRA-RLHF-PyTorch

A full pipeline for fine-tuning the Alpaca LLM with LoRA and RLHF on consumer hardware: an implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the Alpaca architecture. Essentially ChatGPT, but built on Alpaca.

Score: 35 / 100 (Emerging)

This project offers a complete workflow for adapting an existing language model, such as Alpaca, to your specific needs using smaller datasets and affordable hardware. You provide the base model and your own data, and it produces a fine-tuned model that behaves more like a custom chatbot. It is aimed at AI practitioners and researchers who want to personalize large language models without extensive computational resources.

No commits in the last 6 months.

Use this if you want to create a custom, instruction-following large language model from an Alpaca base, leveraging reinforcement learning with human feedback on consumer-grade GPUs.

Not ideal if you need a solution that runs out-of-the-box on very limited memory, as loading both the base and reward models can still exceed consumer hardware limits.

large-language-models model-fine-tuning conversational-ai ai-research natural-language-processing
Stale (6m) · No Package · No Dependents
Maintenance: 0 / 25
Adoption: 8 / 25
Maturity: 16 / 25
Community: 11 / 25


Stars: 61
Forks: 6
Language: Python
License: MIT
Last pushed: Apr 28, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/jackaduma/Alpaca-LoRA-RLHF-PyTorch"

Open to everyone: 100 requests/day with no API key needed. Get a free key for 1,000 requests/day.
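The same endpoint can be queried from a script. A minimal sketch in Python using only the standard library, assuming the endpoint returns a JSON body (the response schema is not documented here, so treat the parsed result as an opaque dict):

```python
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(registry: str, repo: str) -> str:
    """Build the quality-report URL for a repository, mirroring the curl example."""
    return f"{API_BASE}/{registry}/{repo}"


def fetch_quality(registry: str, repo: str) -> dict:
    """Fetch and parse the quality report; assumes a JSON response."""
    with urllib.request.urlopen(quality_url(registry, repo)) as resp:
        return json.load(resp)


# Matches the curl command above:
url = quality_url("transformers", "jackaduma/Alpaca-LoRA-RLHF-PyTorch")
```

Calling `fetch_quality(...)` counts against the same 100 requests/day limit as the curl example; with a key you would add the appropriate auth header (not shown, since the header name is not documented here).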