kjappelbaum/awesome-chemistry-datasets
overview of datasets for ML in chemistry
This is a curated collection of datasets for chemistry professionals working with machine learning. It catalogs datasets of chemical text, molecular structures, and molecular activity and property data. Chemists, materials scientists, and pharmaceutical researchers can use it to quickly find appropriate data for their AI/ML models.
Use this if you need to find specialized chemistry datasets for training machine learning models, covering everything from scientific literature to molecular structures and experimental property data.
Not ideal if you are looking for general-purpose datasets outside of chemistry, or if you need tools for processing or analyzing data rather than the datasets themselves.
Stars: 394
Forks: 45
Language: —
License: CC0-1.0
Category:
Last pushed: Oct 22, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/kjappelbaum/awesome-chemistry-datasets"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
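The same request can be made from a script. The following is a minimal Python sketch, not an official client: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the reading of the path as category/owner/repository is inferred only from the curl URL above, and the response is assumed to be JSON, which this page does not confirm.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example; everything after it is
# assumed to be <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (hypothetical helper, not part of any SDK)."""
    # Percent-encode each segment so unusual characters cannot break the path.
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(url: str, timeout: float = 10.0):
    """Fetch the URL and parse the body as JSON.

    Assumption: the endpoint returns JSON. If it does not,
    json.loads raises a ValueError.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `build_quality_url("ml-frameworks", "kjappelbaum", "awesome-chemistry-datasets")` reproduces the URL from the curl command, and passing that URL to `fetch_quality` performs the request. With the keyless limit of 100 requests/day, it is worth caching the result locally instead of refetching it on every run.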
Higher-rated alternatives
josiehong/awesome-smallmol-massspec-ml
Awesome papers and codes list of small molecule mass spectrometry-related machine learning methods
GoekeLab/awesome-nanopore
A curated list of awesome nanopore analysis tools.
inoue0426/awesome-computational-biology
Awesome list of computational biology.
HongxinXiang/awesome-ai-bioinformatics
A curated list of awesome AI and Bioinformatics.
benb111/awesome-small-molecule-ml
A curated list of resources for machine learning for small-molecule drug discovery