FunnySaltyFish/Better-Ruozhiba
[Entry-by-entry processing complete] A QA dataset of selected Ruozhiba questions, with every entry manually reviewed and revised
This dataset provides a collection of humorous, quirky questions and their corresponding high-quality, 'serious' answers, curated specifically for training large language models in Chinese. It offers human-reviewed and refined question-answer pairs, improving upon an original dataset where some answers were GPT-4 generated. Anyone building or training Chinese language models will find this useful for enhancing their model's ability to handle diverse and nuanced conversational prompts.
253 stars.
Use this if you are a researcher or developer training a large language model and need a unique, human-curated Chinese question-answer dataset for improved model performance.
Not ideal if you are looking for a dataset of standard, factual question-answer pairs or if your language model is not focused on Chinese language understanding and generation.
Stars: 253
Forks: 11
Language: —
License: Apache-2.0
Category:
Last pushed: Feb 21, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/FunnySaltyFish/Better-Ruozhiba"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
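The same request can be made from a script. The following is a minimal Python sketch using only the standard library. It assumes the URL pattern from the curl example generalizes to `<owner>/<repo>`. The helper names `build_url` and `fetch_raw` are hypothetical, and because this page does not document the response format, the body is returned as plain text rather than parsed.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path copied from the curl example; everything after it is owner/repo.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def build_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    # Percent-encode each path segment so unusual characters stay valid.
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_raw(owner: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the response body as text; the payload format is not assumed."""
    with urlopen(build_url(owner, repo), timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)


# Example call (performs a network request and counts against the daily limit):
# print(fetch_raw("FunnySaltyFish", "Better-Ruozhiba"))
```

The keyless tier is limited to 100 requests/day, so a script that polls many repositories may want to cache responses locally.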
Higher-rated alternatives
aalok-sathe/surprisal
A unified interface for computing surprisal (log probabilities) from language models! Supports...
EvolvingLMMs-Lab/lmms-engine
A simple, unified multimodal models training engine. Lean, flexible, and built for hacking at scale.
reasoning-machines/pal
PaL: Program-Aided Language Models (ICML 2023)
microsoft/monitors4codegen
Code and Data artifact for NeurIPS 2023 paper - "Monitor-Guided Decoding of Code LMs with Static...
FreedomIntelligence/EchoX
EchoX: Towards Mitigating Acoustic-Semantic Gap via Echo Training for Speech-to-Speech LLMs