rivas-lab/Smiles2Dock
Smiles2Dock: an open large-scale multi-task dataset for ML-based molecular docking (NeurIPS 2025 AI for Science Workshop)
This project offers a large, ready-to-use dataset for predicting how well small molecules bind to target proteins. Models built on it take chemical structures (SMILES strings) and protein structures (from AlphaFold) as input and output predicted binding scores. Medicinal chemists, computational biologists, and drug discovery researchers can use it to develop and benchmark machine learning models for virtual screening.
Use this if you are a researcher in drug discovery looking for a comprehensive dataset to train and validate machine learning models for molecular docking.
Not ideal if you are looking for an out-of-the-box tool to perform single molecular docking predictions without developing or training your own ML models.
Stars: 10
Forks: n/a
Language: Python
License: Apache-2.0
Category: n/a
Last pushed: Nov 16, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/rivas-lab/Smiles2Dock"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
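For scripted access, the same request can be made from Python. The sketch below is a minimal example using only the standard library. It assumes the endpoint returns JSON, which this page does not state, and it treats the three path segments after /quality/ as a category, an owner, and a repository name, which is inferred from the single URL shown above. The function names build_quality_url and fetch_quality are hypothetical helpers, not part of any published client.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (hypothetical helper).

    Assumption: the path is <category>/<owner>/<repo>, inferred from
    the one example URL on this page.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return "/".join([BASE_URL, *parts])


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send a GET request and decode the body.

    Assumption: the response body is JSON. If it is not, json.loads
    raises a ValueError and the raw body should be inspected instead.
    """
    url = build_quality_url(category, owner, repo)
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the URL; call fetch_quality(...) to hit the network.
    print(build_quality_url("transformers", "rivas-lab", "Smiles2Dock"))
```

Since the keyless tier allows 100 requests/day, it is worth caching responses locally rather than calling fetch_quality in a loop.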
Higher-rated alternatives
rxn4chemistry/rxn-onmt-models
Training of OpenNMT-based RXN models
CTCycle/ADSMOD-Adsorption-Modeling
Streamline adsorption modeling by automatically fitting theoretical adsorption models to...
sanjaradylov/smiles-gpt
Generative Pre-Training from Molecules
lamalab-org/MatText
Text-based modeling of materials.
VectorInstitute/atomgen
Library for handling atomistic graph datasets focusing on transformer-based implementations,...