jonathanking/sidechainnet
An all-atom protein structure dataset for machine learning.
This project offers an all-atom protein structure dataset for scientists and researchers building machine learning models. It includes angles and coordinates for both the backbone and the sidechains, going beyond the backbone-only data found in other datasets. Models can take protein sequence data as input and generate predicted protein structures along with their energy calculations, which is useful for drug discovery and protein engineering.
360 stars. No commits in the last 6 months. Available on PyPI.
Use this if you are a computational biologist or biochemist developing machine learning models to predict complete protein structures, analyze their energy, or visualize them in 3D.
Not ideal if you are looking for a simple protein viewer or a tool for basic sequence alignment, as this is focused on advanced machine learning applications.
Stars: 360
Forks: 40
Language: Python
License: BSD-3-Clause
Category:
Last pushed: Mar 16, 2024
Commits (30d): 0
Dependencies: 10
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/jonathanking/sidechainnet"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
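The same request can be made from Python. The following is a minimal sketch using only the standard library. It assumes the endpoint returns JSON and that the path follows the category/owner/repo layout seen in the curl example above; neither is confirmed by this page. The names `quality_url` and `fetch_quality` are hypothetical helpers, not part of any published client, and no API key handling is shown because the page does not describe how a key is sent.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the curl example; everything after it is
# category/owner/repo (layout inferred from that single example).
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    # Percent-encode each path segment so unusual characters stay valid.
    segments = [urllib.parse.quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(segments)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """GET the endpoint and decode the body.

    Assumption: the response body is JSON. If it is not, json.loads
    raises a ValueError and the raw body should be inspected instead.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("ml-frameworks", "jonathanking", "sidechainnet")` issues the same request as the curl command. Each call counts against the 100 requests/day keyless limit, so caching the result locally is worth considering.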
Related frameworks
DeepRank/deeprank2
An open-source deep learning framework for data mining of protein-protein interfaces or...
sacdallago/biotrainer
Biological prediction models made simple.
BioinfoMachineLearning/DIPS-Plus
The Enhanced Database of Interacting Protein Structures for Interface Prediction
a-r-j/ProteinWorkshop
Benchmarking framework for protein representation learning. Includes a large number of...
songlab-cal/tape
Tasks Assessing Protein Embeddings (TAPE), a set of five biologically relevant semi-supervised...