h2oai/h2o-wizardlm
Open-Source Implementation of WizardLM to turn documents into Q:A pairs for LLM fine-tuning
This tool helps developers and researchers fine-tune large language models (LLMs) by automatically generating complex question-and-answer pairs. You provide an existing instruction-tuned LLM and, optionally, seed prompts or a document corpus; it outputs a rich dataset of instruction prompts and their corresponding responses. This is ideal for teams building open-source LLMs without proprietary data dependencies.
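The core idea behind WizardLM-style generation is Evol-Instruct: seed instructions are repeatedly "evolved" into more complex variants by prompting an LLM, and the evolved instructions are then answered to produce instruction/response training pairs. The sketch below illustrates that loop under stated assumptions: `call_llm` is a hypothetical stand-in for whatever instruction-tuned model you plug in, and the evolution templates are illustrative, not the repo's exact prompts.

```python
# Minimal sketch of an Evol-Instruct-style loop (illustrative, not the
# repo's actual implementation). `call_llm` is a hypothetical callable
# that takes a prompt string and returns the model's text response.
import random

# Illustrative evolution prompts; real implementations use richer sets.
EVOLUTION_TEMPLATES = [
    "Rewrite the following instruction so it requires multi-step reasoning:\n{instruction}",
    "Add a concrete constraint (format, length, or domain) to this instruction:\n{instruction}",
    "Make this instruction more specific by adding realistic context:\n{instruction}",
]

def evolve(instruction: str, call_llm, rounds: int = 2) -> str:
    """Apply `rounds` randomly chosen evolution prompts to one seed."""
    for _ in range(rounds):
        template = random.choice(EVOLUTION_TEMPLATES)
        instruction = call_llm(template.format(instruction=instruction))
    return instruction

def build_dataset(seeds, call_llm, rounds: int = 2):
    """Return instruction/response dicts suitable for fine-tuning."""
    pairs = []
    for seed in seeds:
        evolved = evolve(seed, call_llm, rounds)
        response = call_llm(evolved)  # the same model answers its own prompt
        pairs.append({"instruction": evolved, "response": response})
    return pairs
```

Because both the evolution step and the answer step go through the same model, the quality of the resulting dataset is bounded by the quality of the LLM you supply, which is why the tool works best with a reasonably capable base model.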
308 stars. No commits in the last 6 months.
Use this if you need to create diverse, high-complexity instruction datasets to further train your large language models, especially when aiming for Apache-2.0-licensed models and data.
Not ideal if you need very fast dataset generation or if your starting instruction-tuned LLM is weak, since the quality of the generated pairs tracks the quality of the input model.
Stars
308
Forks
26
Language
Python
License
Apache-2.0
Category
Last pushed
Oct 24, 2024
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/h2oai/h2o-wizardlm"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
axolotl-ai-cloud/axolotl
Go ahead and axolotl questions
google/paxml
Pax is a Jax-based machine learning framework for training large scale models. Pax allows for...
JosefAlbers/PVM
Phi-3.5 for Mac: Locally-run Vision and Language Models for Apple Silicon
iamarunbrahma/finetuned-qlora-falcon7b-medical
Finetuning of Falcon-7B LLM using QLoRA on Mental Health Conversational Dataset
WangRongsheng/Aurora
The official codes for "Aurora: Activating chinese chat capability for Mixtral-8x7B sparse...