carbonz0/alpaca-chinese-dataset
Alpaca Chinese instruction fine-tuning dataset
This project provides a dataset for anyone looking to train or fine-tune language models to better understand and respond in Chinese. It collects common instructions and examples, translated into or generated in Chinese, into a structured set of instruction-response pairs. It is aimed at AI researchers, language model developers, and data scientists working on Chinese natural language processing.
397 stars. No commits in the last 6 months.
Use this if you need a high-quality, pre-structured dataset of Chinese instructions and corresponding outputs to teach an AI model how to follow commands in Chinese.
Not ideal if you are looking for a dataset of general Chinese text for tasks like sentiment analysis or machine translation without an instruction-following component.
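Instruction-following datasets in the Alpaca lineage conventionally store each example as a JSON object with `instruction`, `input`, and `output` fields. A minimal sketch of that record shape, assuming this repo follows the standard Alpaca schema (the exact field names in the repo may differ):

```python
import json

# A typical Alpaca-style record: an instruction (with an optional input)
# paired with the desired model response. Field names follow the common
# Alpaca convention and are an assumption here.
record = {
    "instruction": "把下面的句子翻译成英文。",  # "Translate the sentence below into English."
    "input": "今天天气很好。",                  # "The weather is nice today."
    "output": "The weather is nice today.",
}

# Such datasets are usually distributed as a JSON array of records.
serialized = json.dumps([record], ensure_ascii=False)
loaded = json.loads(serialized)
print(loaded[0]["output"])
```

During fine-tuning, the `instruction` and `input` fields are typically concatenated into a prompt template and the model is trained to produce `output`.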
Stars: 397
Forks: 24
Language: —
License: —
Category:
Last pushed: Mar 26, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/carbonz0/alpaca-chinese-dataset"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
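The same endpoint can be called from code. A minimal sketch in Python, assuming the URL pattern `/api/v1/quality/llm-tools/{owner}/{repo}` inferred from the example above (the response schema is not documented here, so it is parsed generically as JSON):

```python
import json
import urllib.request

BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"

def repo_stats_url(owner: str, repo: str) -> str:
    # Endpoint pattern inferred from the curl example above; treat as an assumption.
    return f"{BASE}/{owner}/{repo}"

def fetch_stats(owner: str, repo: str, timeout: float = 10.0) -> dict:
    # Anonymous access is limited to 100 requests/day; a free key raises
    # this to 1,000/day per the note above.
    with urllib.request.urlopen(repo_stats_url(owner, repo), timeout=timeout) as resp:
        return json.load(resp)

print(repo_stats_url("carbonz0", "alpaca-chinese-dataset"))
```

Swap in `requests` or any HTTP client you prefer; nothing here depends on the standard library specifically.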
Higher-rated alternatives
axolotl-ai-cloud/axolotl
Go ahead and axolotl questions
google/paxml
Pax is a Jax-based machine learning framework for training large scale models. Pax allows for...
JosefAlbers/PVM
Phi-3.5 for Mac: Locally-run Vision and Language Models for Apple Silicon
iamarunbrahma/finetuned-qlora-falcon7b-medical
Finetuning of Falcon-7B LLM using QLoRA on Mental Health Conversational Dataset
h2oai/h2o-wizardlm
Open-Source Implementation of WizardLM to turn documents into Q:A pairs for LLM fine-tuning