aqlaboratory/proteinnet

Standardized data set for machine learning of protein structure

49 / 100 (Emerging)

This project offers standardized protein sequences and structures (secondary and tertiary) along with multiple sequence alignments. It provides ready-to-use training, validation, and test datasets for machine learning research into protein structure prediction. It is aimed at scientists and researchers in biochemistry or bioinformatics who are developing new computational methods for predicting protein shapes.

910 stars. No commits in the last 6 months.

Use this if you are a researcher developing machine learning models for protein structure prediction and need a standardized, historically accurate dataset to benchmark your methods against established challenges.

Not ideal if you need immediate access to the raw MSA data for CASP 12, or if you are looking for a tool that performs protein structure prediction rather than a dataset for training models.

protein-structure-prediction biochemistry-research bioinformatics computational-biology machine-learning-datasets
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 23 / 25
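
The four category scores above add up to the headline score. The page does not state the formula, so the following is only a sketch that assumes the total is a plain sum of the four categories, each capped at 25:

```python
# Category scores copied from the listing above.
# Assumption: the overall score is their unweighted sum (not confirmed by the page).
subscores = {
    "Maintenance": 0,
    "Adoption": 10,
    "Maturity": 16,
    "Community": 23,
}
MAX_PER_CATEGORY = 25

total = sum(subscores.values())
maximum = MAX_PER_CATEGORY * len(subscores)

print(f"{total} / {maximum}")  # prints "49 / 100"
```

Under that assumption, the Maintenance score of 0 accounts for most of the gap between this project and a higher rating.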


Stars: 910
Forks: 138
Language: Python
License: MIT
Last pushed: Nov 18, 2020
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/aqlaboratory/proteinnet"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
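
The same request can be made from Python with only the standard library. This is a minimal sketch: the URL layout is taken from the curl command above, but the helper names `build_quality_url` and `fetch_quality` are hypothetical, and the response is assumed to be JSON, which the page does not confirm.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record for one repository.

    Assumption: the endpoint returns a JSON body. The field names are not
    documented on the page, so the parsed object is returned unchanged.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the URL; call fetch_quality(...) to perform the request.
    print(build_quality_url("ml-frameworks", "aqlaboratory", "proteinnet"))
```

Because unauthenticated access is limited to 100 requests/day, caching the result locally is advisable when checking many repositories.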