GAIR-NLP/MegaScience

MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning

43 / 100 (Emerging)

MegaScience helps scientists and researchers build or improve AI models for complex scientific reasoning. It provides high-quality datasets containing millions of science reasoning questions and answers, extracted from university textbooks across seven disciplines. Models post-trained on this data are intended to understand and answer intricate scientific problems, which makes it easier to develop AI scientists or research assistants.

113 stars.

Use this if you are a researcher or institution looking to develop or enhance AI models that can accurately perform scientific reasoning tasks across various disciplines.

Not ideal if you need a pre-built, off-the-shelf AI model for general knowledge or non-scientific tasks.

scientific-research AI-development science-education reasoning-systems natural-sciences
No package, no dependents
Maintenance 10 / 25
Adoption 9 / 25
Maturity 15 / 25
Community 9 / 25

Stars: 113
Forks: 6
Language: Python
License: Apache-2.0
Last pushed: Feb 02, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/GAIR-NLP/MegaScience"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
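
The same request can be made from Python. The following is a minimal sketch using only the standard library, not an official client. The helper names build_quality_url and fetch_quality are hypothetical, the category/owner/repo path layout is inferred from the single curl URL above, and the assumption that the endpoint returns a JSON body is not confirmed by this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(owner: str, repo: str, category: str = "transformers") -> str:
    """Build the quality-endpoint URL for a repository.

    The <category>/<owner>/<repo> layout is an assumption inferred from
    the one example URL on this page.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(owner: str, repo: str, category: str = "transformers",
                  timeout: float = 10.0):
    """Fetch the quality record; assumes (unverified) a JSON response body."""
    request = urllib.request.Request(
        build_quality_url(owner, repo, category),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality("GAIR-NLP", "MegaScience")
    # to perform the actual network request.
    print(build_quality_url("GAIR-NLP", "MegaScience"))
```

Since unauthenticated access is limited to 100 requests/day, it may be worth caching the result locally rather than calling fetch_quality repeatedly.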