h2oai/h2o-wizardlm

Open-Source Implementation of WizardLM to turn documents into Q:A pairs for LLM fine-tuning

40 / 100 (Emerging)

This tool helps developers and researchers fine-tune large language models (LLMs) by automatically generating complex question-and-answer pairs. You provide an existing instruction-tuned LLM and, optionally, some initial prompts or a document corpus, and it outputs a rich dataset of instruction prompts and their corresponding responses. This is ideal for those building open-source LLMs without proprietary data dependencies.
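The flow described above (seed prompts in, instruction/response pairs out) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the h2o-wizardlm implementation: `build_dataset`, `EVOLVE_TEMPLATE`, and `echo_model` are hypothetical names, the rewrite prompt is invented for the example, and the "model" is a stand-in callable rather than a real LLM.

```python
from typing import Callable, Dict, List

# Hypothetical rewrite prompt; NOT taken from the h2o-wizardlm source.
EVOLVE_TEMPLATE = (
    "Rewrite the following instruction so that it is more complex "
    "but still answerable:\n{instruction}"
)


def build_dataset(
    generate: Callable[[str], str],
    seed_prompts: List[str],
    rounds: int = 2,
) -> List[Dict[str, str]]:
    """Evolve each seed prompt `rounds` times, then answer the final version."""
    dataset = []
    for prompt in seed_prompts:
        instruction = prompt
        for _ in range(rounds):
            # Ask the model to produce a harder variant of the instruction.
            instruction = generate(
                EVOLVE_TEMPLATE.format(instruction=instruction)
            ).strip()
        # Ask the same model to answer the evolved instruction.
        response = generate(instruction).strip()
        dataset.append({"instruction": instruction, "response": response})
    return dataset


def echo_model(prompt: str) -> str:
    """Stand-in for a real instruction-tuned LLM (assumption for this sketch)."""
    return f"[model output for a {len(prompt)}-character prompt]"


pairs = build_dataset(echo_model, ["Explain what fine-tuning is."], rounds=2)
```

Because the same model both rewrites and answers the prompts, a weak input model would yield weak pairs, which is consistent with the caveat on this page about output quality depending on the input model.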

308 stars. No commits in the last 6 months.

Use this if you need to create diverse, high-complexity instruction datasets to further train your large language models, especially when aiming for Apache 2.0 licensed models and data.

Not ideal if you require extremely fast dataset generation or if your current instruction-tuned LLM is weak, because the quality of the generated outputs depends on the input model.

Tags: LLM fine-tuning, natural language processing, AI model training, dataset generation, open-source AI
Status: Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 14 / 25
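As a quick arithmetic check, the four sub-scores listed here add up to the overall score shown on this page. The snippet below assumes the overall score is a plain sum of the four categories; the page does not state the formula, so treat that as an assumption.

```python
# Sub-scores as listed on this page, each out of 25.
subscores = {"Maintenance": 0, "Adoption": 10, "Maturity": 16, "Community": 14}

# Assumption: the overall score is the unweighted sum of the categories.
total = sum(subscores.values())
maximum = 25 * len(subscores)

print(f"{total} / {maximum}")  # prints "40 / 100"
```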


Stars: 308
Forks: 26
Language: Python
License: Apache-2.0
Category: llm-fine-tuning
Last pushed: Oct 24, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/h2oai/h2o-wizardlm"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
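The same request can be made from Python using only the standard library. This is a sketch under assumptions: the endpoint path is copied from the curl command above, the helper names `quality_url` and `fetch_quality` are hypothetical, and the response is assumed to be JSON, which this page does not confirm.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def quality_url(owner: str, repo: str) -> str:
    """Build the per-repository URL, percent-encoding each path segment."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record; assumes the body is JSON (not confirmed here)."""
    with urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# print(json.dumps(fetch_quality("h2oai", "h2o-wizardlm"), indent=2))
```

No API key is sent in this sketch, so it falls under the keyless 100 requests/day limit mentioned above.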