Hemanthkumar2112/Reward-Modeling-RLHF-Finetune-and-RAG

Sample code base for fine-tuning Gemma2 (9B) and Llama3-8B with RAG, implemented on the Kaggle platform

Score: 39 / 100 (Emerging)

This project helps AI practitioners and researchers improve the quality and relevance of large language model outputs. By collecting human preferences on different model responses, you can train a 'reward model' that guides the language model to generate text that better aligns with human expectations. This allows for fine-tuning models like Llama3 8B or Gemma2 9B to produce more desirable and contextually accurate results for various applications.
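Reward models of this kind are typically trained with a pairwise (Bradley-Terry) objective: given a human-preferred response and a rejected one, the model should assign the preferred response a higher scalar score. A minimal sketch of that loss, assuming scalar reward scores per response (the function name and values are illustrative, not taken from this repository):

```python
import math

def pairwise_preference_loss(chosen_score: float, rejected_score: float) -> float:
    """Bradley-Terry reward-modeling loss: -log(sigmoid(r_chosen - r_rejected)).

    The loss shrinks as the reward model scores the human-preferred
    (chosen) response increasingly above the rejected one.
    """
    margin = chosen_score - rejected_score
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# A model that ranks the preferred response higher incurs a small loss;
# ranking it lower incurs a large one.
print(pairwise_preference_loss(2.0, 0.0))
print(pairwise_preference_loss(0.0, 2.0))
```

Minimizing this loss over a dataset of human preference pairs is what produces the reward signal used to guide the fine-tuned language model.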

No commits in the last 6 months.

Use this if you need to fine-tune existing large language models to produce highly relevant, human-aligned, and contextually rich text outputs for specific tasks or domains.

Not ideal if you are looking for a plug-and-play solution without any technical knowledge of machine learning or data collection for model training.

Tags: AI-research, NLP-development, language-model-fine-tuning, generative-AI, content-generation

Status: Stale (6 months) · No Package · No Dependents

Maintenance: 0 / 25
Adoption: 6 / 25
Maturity: 16 / 25
Community: 17 / 25


Stars: 22
Forks: 8
Language: Jupyter Notebook
License: Apache-2.0
Last pushed: Feb 08, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/rag/Hemanthkumar2112/Reward-Modeling-RLHF-Finetune-and-RAG"

Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
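The curl call above maps directly onto any HTTP client. A minimal sketch using only the Python standard library; the helper names are ours, and only the endpoint URL comes from this page (the response is assumed to be JSON, which the API's shape suggests but this page does not state):

```python
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality/rag"

def quality_url(owner: str, repo: str) -> str:
    """Build the quality-API endpoint URL for a GitHub owner/repo pair."""
    return f"{API_BASE}/{owner}/{repo}"

def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch and parse the quality report (no API key needed for
    up to 100 requests/day, per the notice above)."""
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)

print(quality_url("Hemanthkumar2112", "Reward-Modeling-RLHF-Finetune-and-RAG"))
```

`fetch_quality` performs the network call, so the URL construction is kept separate and usable offline.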