yyDing1/ScaleQuest

[ACL 2025] We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLMs.

Emerging

This project helps researchers and developers improve the reasoning abilities of large language models (LLMs) without relying on large, pre-existing datasets. It provides a method and models for generating new, challenging questions from scratch, which are then used to train and refine LLMs. The result is an LLM that is better at solving complex problems, particularly mathematical and reasoning tasks. It is aimed at AI researchers and practitioners focused on enhancing LLM capabilities.

No commits in the last 6 months.

Use this if you need to create high-quality, diverse question-and-answer datasets to train or fine-tune an LLM, especially when starting with limited or no seed data for complex reasoning problems.

Not ideal if you already have extensive, high-quality question-and-answer datasets or are looking for a simple, off-the-shelf solution for common language generation tasks.

AI research, LLM training, mathematical reasoning, data synthesis, question generation

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 8 / 25

Maturity 16 / 25

Community 11 / 25

Language: Python

License: Apache-2.0

Related models

Peiyang-Song/LLM-A-Not-B-Errors

Official repository for paper "In-Context Learning May Not Elicit Trustworthy Reasoning: A-Not-B...

yilin-geng/llm-instruction-conflicts

This repository contains the data and the code for the paper "Control Illusion: The Failure of...

valeria-izvoreanu/LLM-Hallucination-Detection-SemEval2024

Semi-supervised pipeline to detect LLM hallucinations. Uses Mistral-7B for zero-shot...

noanonkes/Hallucination-Detection-in-LLMs

Detecting Hallucinations in Large Language Model Generations using Graph Structures
