carbonz0/alpaca-chinese-dataset

Alpaca Chinese instruction fine-tuning dataset

31 / 100 (Emerging)

This project provides a dataset for anyone looking to train or fine-tune AI models to better understand and generate responses in Chinese. It takes common instructions and examples, translated or generated, and produces a structured collection of Chinese instruction-response pairs. This is ideal for AI researchers, language model developers, or data scientists working on Chinese natural language processing applications.

397 stars. No commits in the last 6 months.

Use this if you need a pre-structured dataset of Chinese instructions and corresponding outputs to teach an AI model to follow instructions in Chinese.

Not ideal if you are looking for a dataset of general Chinese text for tasks like sentiment analysis or machine translation without an instruction-following component.
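As a rough illustration of what instruction-response pairs can look like in practice, the sketch below parses a few records and turns them into (prompt, response) pairs. It assumes the Stanford Alpaca field names (`instruction`, `input`, `output`); this listing does not confirm the repository's actual file layout, and the two sample records are invented for the example rather than taken from the dataset.

```python
import json

# Hypothetical sample in the Stanford Alpaca layout (instruction / input / output).
# Field names are an assumption; check the repository's files before relying on them.
SAMPLE_JSON = """
[
  {"instruction": "将下面的句子翻译成英文。", "input": "今天天气很好。", "output": "The weather is nice today."},
  {"instruction": "列出三种常见的水果。", "input": "", "output": "苹果、香蕉、橙子。"}
]
"""

def build_prompt(record: dict) -> str:
    """Join the instruction and the optional input into one prompt string."""
    instruction = record["instruction"].strip()
    context = record.get("input", "").strip()
    if context:
        return f"{instruction}\n\n{context}"
    return instruction

def to_pairs(raw_json: str) -> list[tuple[str, str]]:
    """Parse JSON text and return a list of (prompt, response) pairs."""
    records = json.loads(raw_json)
    return [(build_prompt(r), r["output"].strip()) for r in records]

pairs = to_pairs(SAMPLE_JSON)
for prompt, response in pairs:
    print(repr(prompt), "->", repr(response))
```

Records with an empty `input` yield the instruction alone as the prompt, which is how Alpaca-style data usually distinguishes context-free tasks from tasks that carry extra input.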

AI model training, Chinese NLP, Instruction tuning, Language model development, Data science
No License, Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 8 / 25
Community 13 / 25


Stars: 397
Forks: 24
Language: not listed
License: none listed
Category: llm-fine-tuning
Last pushed: Mar 26, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/carbonz0/alpaca-chinese-dataset"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
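The same request can be made from a script. The following is a minimal sketch using only the Python standard library; the helper names `quality_url` and `fetch_quality` are hypothetical, and because the response format is not documented in this listing, the code tries JSON first and falls back to the raw text.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL; the slash in owner/repo is kept unescaped."""
    return f"{BASE_URL}/{quote(category)}/{quote(repo, safe='/')}"

def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Fetch the record; return parsed JSON if possible, else the raw text."""
    with urllib.request.urlopen(quality_url(category, repo), timeout=timeout) as resp:
        body = resp.read().decode("utf-8")
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        return body  # the response schema is an unknown here, so keep the raw text

# Usage (performs a network request, so it is left commented out):
# print(fetch_quality("llm-tools", "carbonz0/alpaca-chinese-dataset"))
```

With the keyless limit of 100 requests/day, it may be worth caching the result locally rather than calling the endpoint on every run.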