raunak-agarwal/instruction-datasets

Datasets for Instruction Tuning of Large Language Models

28 / 100

Experimental

This is a curated collection of datasets for fine-tuning Large Language Models (LLMs) to follow instructions more accurately. It covers a wide range of textual and multimodal data, from conversational exchanges to task-specific prompts in multiple languages, all intended as training input for an LLM. The intended result is a model that is better at understanding and executing complex instructions. This resource is for AI researchers and machine learning engineers who are building and improving LLMs for various applications.

261 stars. No commits in the last 6 months.

Use this if you are a machine learning engineer or researcher looking for high-quality, pre-processed datasets to train or fine-tune large language models to better understand and follow user instructions.

Not ideal if you are an end-user simply looking to use an existing language model or if you require datasets for traditional machine learning tasks outside of LLM instruction tuning.

Tags: LLM training, NLP research, AI model fine-tuning, conversational AI, machine learning engineering

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 8 / 25

Community 10 / 25

Stars

261

Forks

Language

—

License

—

Higher-rated alternatives

MantisAI/sieves

Plug-and-play document AI with zero-shot models.

xiaoya-li/Instruction-Tuning-Survey

Project for the paper entitled `Instruction Tuning for Large Language Models: A Survey`

TencentARC-QQ/TagGPT

TagGPT: Large Language Models are Zero-shot Multimodal Taggers

rafaelpierre/bullet

bullet: A Zero-Shot / Few-Shot Learning, LLM Based, text classification framework

amazon-science/adaptive-in-context-learning

AdaICL: Which Examples to Annotate for In-Context Learning? Towards Effective and Efficient Selection
