kjappelbaum/awesome-chemistry-datasets

overview of datasets for ML in chemistry

49 / 100 (Emerging)

This is a curated list of datasets for machine learning in chemistry. It covers chemical text, molecular structures, and data for predicting molecular activities and properties. Chemists, materials scientists, and pharmaceutical researchers can use it to quickly find suitable data for their AI/ML models.

394 stars.

Use this if you need to find specialized chemistry datasets for training machine learning models, covering everything from scientific literature to molecular structures and experimental property data.

Not ideal if you need general-purpose datasets outside chemistry, or tools for processing and analyzing data rather than the datasets themselves.

computational-chemistry drug-discovery materials-science cheminformatics chemical-research
No Package No Dependents
Maintenance 6 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 17 / 25
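
The four subscores listed above add up to the overall 49 / 100 figure. The short check below illustrates that arithmetic; it assumes the overall score is a plain sum of the four categories, which the numbers are consistent with but which this page does not state outright.

```python
# Subscores as listed on this page, each out of 25.
subscores = {"Maintenance": 6, "Adoption": 10, "Maturity": 16, "Community": 17}
MAX_PER_CATEGORY = 25

# Assumption: the overall score is the unweighted sum of the subscores.
total = sum(subscores.values())
max_total = MAX_PER_CATEGORY * len(subscores)

print(f"{total} / {max_total}")  # prints "49 / 100"
```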


Stars 394
Forks 45
Language (none listed)
License CC0-1.0
Last pushed Oct 22, 2025
Commits (30d) 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/kjappelbaum/awesome-chemistry-datasets"

Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
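
The same request can be made from Python. This is a minimal sketch using only the standard library. The function names `build_quality_url` and `fetch_quality` are hypothetical, the `category/owner/repo` path pattern is inferred from the single curl URL above, and the assumption that the endpoint returns JSON is not confirmed by this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, assuming a category/owner/repo path layout."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """GET the endpoint and decode the body as JSON (assumed response format)."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs a network request and counts against the daily limit):
# data = fetch_quality("ml-frameworks", "kjappelbaum", "awesome-chemistry-datasets")
# print(json.dumps(data, indent=2))
```

No API key is passed here, so calls fall under the keyless 100 requests/day limit. How a key would be supplied is not documented on this page, so the sketch leaves that out.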