FunnySaltyFish/Better-Ruozhiba

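[Entry-by-entry processing complete] A QA dataset of selected Ruozhiba (弱智吧) questions, with every entry manually reviewed and revised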

/ 100

Emerging

This dataset provides a collection of humorous, quirky questions and their corresponding high-quality, 'serious' answers, curated specifically for training large language models in Chinese. It offers human-reviewed and refined question-answer pairs, improving upon an original dataset where some answers were GPT-4 generated. Anyone building or training Chinese language models will find this useful for enhancing their model's ability to handle diverse and nuanced conversational prompts.

253 stars.

Use this if you are a researcher or developer training a large language model and need a unique, human-curated Chinese question-answer dataset for improved model performance.

Not ideal if you are looking for a dataset of standard, factual question-answer pairs or if your language model is not focused on Chinese language understanding and generation.

AI-training-data Chinese-NLP conversational-AI language-model-development text-generation-training

No Package No Dependents

Maintenance 10 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 9 / 25

Stars 253

Forks

Language —

License Apache-2.0

Higher-rated alternatives

aalok-sathe/surprisal

A unified interface for computing surprisal (log probabilities) from language models! Supports...

EvolvingLMMs-Lab/lmms-engine

A simple, unified multimodal models training engine. Lean, flexible, and built for hacking at scale.

reasoning-machines/pal

PaL: Program-Aided Language Models (ICML 2023)

microsoft/monitors4codegen

Code and Data artifact for NeurIPS 2023 paper - "Monitor-Guided Decoding of Code LMs with Static...

FreedomIntelligence/EchoX

EchoX: Towards Mitigating Acoustic-Semantic Gap via Echo Training for Speech-to-Speech LLMs
