yyDing1/ScaleQuest

[ACL 2025] We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLMs.

Overall score: 35 / 100 (Emerging)

This project helps researchers and developers improve the reasoning abilities of Large Language Models (LLMs) without needing large, pre-existing datasets. It provides a method and models to generate new, challenging questions from scratch, which are then used to train and refine LLMs. The output is an LLM that is better at solving complex problems, particularly in mathematics and reasoning tasks. It is ideal for AI researchers and practitioners focused on enhancing LLM capabilities.

No commits in the last 6 months.

Use this if you need to create high-quality, diverse question-and-answer datasets to train or fine-tune an LLM, especially when starting with limited or no seed data for complex reasoning problems.

Not ideal if you already have extensive, high-quality question-and-answer datasets or are looking for a simple, off-the-shelf solution for common language generation tasks.

Tags: AI research, LLM training, mathematical reasoning, data synthesis, question generation
Status: Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 8 / 25
Maturity 16 / 25
Community 11 / 25

Stars: 68
Forks: 7
Language: Python
License: Apache-2.0
Last pushed: Oct 27, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/yyDing1/ScaleQuest"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
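
The same request can be made from Python. The following is a minimal sketch using only the standard library. It assumes the endpoint returns a JSON body, which this page does not confirm, and the names quality_url and fetch_quality are hypothetical helpers introduced here for illustration, not part of any published client.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/transformers"


def quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    # Percent-encode each path segment so unusual characters stay in place.
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Send a GET request and decode the response.

    Assumption: the body is JSON. If the service returns another format,
    json.loads raises an error and the raw bytes should be inspected instead.
    """
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling fetch_quality("yyDing1", "ScaleQuest") issues the same request as the curl command above. With no key, the 100 requests/day limit applies, so caching the result locally is a reasonable choice when polling several repositories.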