Instruction Tuning Datasets LLM Tools

Datasets, papers, and resources specifically for instruction tuning and instruction following in LLMs. This category does not include general fine-tuning methods, evaluation benchmarks, or model inference tools.

19 instruction tuning dataset tools are tracked. One scores above 50 (Established tier). The highest-rated is MantisAI/sieves at 55/100 with 125 stars.

Get all 19 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=llm-tools&subcategory=instruction-tuning-datasets&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
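The same request can be made from a script. Below is a minimal Python sketch using only the standard library. The endpoint and query parameters are copied from the curl command above; the helper names `build_url` and `fetch_projects` are hypothetical, and because the response schema is not documented here, the sketch only assumes the body is valid JSON and returns it unparsed beyond that.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Endpoint taken from the curl example; everything else is illustrative.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="llm-tools",
              subcategory="instruction-tuning-datasets",
              limit=20):
    """Assemble the query URL with the same parameters as the curl call."""
    query = urlencode({
        "domain": domain,
        "subcategory": subcategory,
        "limit": limit,
    })
    return f"{BASE_URL}?{query}"


def fetch_projects(timeout=10.0):
    """Send the GET request and decode the JSON body.

    Assumption: the body is JSON, as the page states. Its exact structure
    (list vs. object, field names) is not specified, so it is returned as-is.
    """
    request = Request(build_url(), headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_projects()` performs one network request, which counts toward the keyless limit of 100 requests/day, so caching the result locally is a sensible default for repeated runs.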

# | Tool | Description | Score | Tier
1 | MantisAI/sieves | Plug-and-play document AI with zero-shot models. | 55 | Established
2 | xiaoya-li/Instruction-Tuning-Survey | Project for the paper entitled `Instruction Tuning for Large Language... | 44 | Emerging
3 | TencentARC-QQ/TagGPT | TagGPT: Large Language Models are Zero-shot Multimodal Taggers | 35 | Emerging
4 | rafaelpierre/bullet | bullet: A Zero-Shot / Few-Shot Learning, LLM Based, text classification framework | 35 | Emerging
5 | amazon-science/adaptive-in-context-learning | AdaICL: Which Examples to Annotate of In-Context Learning? Towards Effective... | 34 | Emerging
6 | andrewzamai/SLIMER_IT | An Instruction-tuned LLM for zero-shot NER on Italian | 33 | Emerging
7 | princeton-pli/STAT | Skill-Targeted Adaptive Training | 32 | Emerging
8 | LIN-SHANG/InstructERC | The official implementation of InstructERC | 31 | Emerging
9 | OpenGVLab/Instruct2Act | Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with... | 30 | Emerging
10 | Lichang-Chen/InstructZero | Official Implementation of InstructZero; the first framework to optimize bad... | 30 | Emerging
11 | raunak-agarwal/instruction-datasets | Datasets for Instruction Tuning of Large Language Models | 28 | Experimental
12 | basicv8vc/chinese-instruction-datasets-for-llms | Chinese instruction datasets for fine-tuning LLMs | 27 | Experimental
13 | snowood1/Zero-Shot-PLOVER | Leveraging Codebook Knowledge with NLI and ChatGPT for Zero-Shot Political... | 25 | Experimental
14 | A-baoYang/instruction-finetune-datasets | Collect and maintain high quality instruction finetune datasets in different... | 22 | Experimental
15 | Reason-Wang/notable-instruction-llm | The repo collects model and data projects for instruction following large... | 21 | Experimental
16 | Showndarya/Few-Shot-ChatGPT | Zero-Shot and Few-shot learning method using ChatGPT on problem sets | 20 | Experimental
17 | orionw/FollowIR | FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions | 16 | Experimental
18 | DeperiasKerre/qpInstruct | Instruction Dataset for QCL properties Extraction from Text | 13 | Experimental
19 | davor10105/laat | Use LLMs as training regularizers for small, differentiable models and... | 11 | Experimental