h2oai/h2o-wizardlm
Open-Source Implementation of WizardLM to turn documents into Q:A pairs for LLM fine-tuning
This tool helps developers and researchers fine-tune large language models (LLMs) by automatically generating complex question-and-answer pairs. You provide an existing instruction-tuned LLM and, optionally, seed prompts or a document corpus; it outputs a rich dataset of instruction prompts and their corresponding responses. This is ideal for teams building open-source LLMs without proprietary data dependencies.
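The core idea behind WizardLM-style generation is Evol-Instruct: seed instructions are repeatedly "evolved" into more complex variants by prompting an LLM, and the evolved instructions are then answered to produce instruction/response training pairs. The sketch below illustrates that loop under stated assumptions: `call_llm` is a hypothetical stand-in for whatever instruction-tuned model you plug in, and the evolution templates are illustrative, not the repo's exact prompts.

```python
# Minimal sketch of an Evol-Instruct-style loop (illustrative, not the
# repo's actual implementation). `call_llm` is a hypothetical callable
# that takes a prompt string and returns the model's text response.
import random

# Illustrative evolution prompts; real implementations use richer sets.
EVOLUTION_TEMPLATES = [
    "Rewrite the following instruction so it requires multi-step reasoning:\n{instruction}",
    "Add a concrete constraint (format, length, or domain) to this instruction:\n{instruction}",
    "Make this instruction more specific by adding realistic context:\n{instruction}",
]

def evolve(instruction: str, call_llm, rounds: int = 2) -> str:
    """Apply `rounds` randomly chosen evolution prompts to one seed."""
    for _ in range(rounds):
        template = random.choice(EVOLUTION_TEMPLATES)
        instruction = call_llm(template.format(instruction=instruction))
    return instruction

def build_dataset(seeds, call_llm, rounds: int = 2):
    """Return instruction/response dicts suitable for fine-tuning."""
    pairs = []
    for seed in seeds:
        evolved = evolve(seed, call_llm, rounds)
        response = call_llm(evolved)  # the same model answers its own prompt
        pairs.append({"instruction": evolved, "response": response})
    return pairs
```

Because both the evolution step and the answer step go through the same model, the quality of the resulting dataset is bounded by the quality of the LLM you supply, which is why the tool works best with a reasonably capable base model.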
308 stars. No commits in the last 6 months.
Use this if you need to create diverse, high-complexity instruction datasets to further train your large language models, especially when aiming for Apache-2.0-licensed models and data.
Not ideal if you need very fast dataset generation or if your starting instruction-tuned LLM is weak, since the quality of the generated pairs tracks the quality of the input model.
Stars
308
Forks
26
Language
Python
License
Apache-2.0
Category
Last pushed
Oct 24, 2024
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/h2oai/h2o-wizardlm"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
axolotl-ai-cloud/axolotl
Go ahead and axolotl questions
google/paxml
Pax is a Jax-based machine learning framework for training large scale models. Pax allows for...
JosefAlbers/PVM
Phi-3.5 for Mac: Locally-run Vision and Language Models for Apple Silicon
iamarunbrahma/finetuned-qlora-falcon7b-medical
Finetuning of Falcon-7B LLM using QLoRA on Mental Health Conversational Dataset
WangRongsheng/Aurora
The official codes for "Aurora: Activating chinese chat capability for Mixtral-8x7B sparse...