omarperacha/ps4-dataset
The largest open-source dataset for Protein Single Sequence Secondary Structure prediction.
This dataset and toolkit help protein scientists and bioinformaticians develop and evaluate models that predict a protein's secondary structure from its amino acid sequence alone. A trained model takes an amino acid sequence as input and outputs the secondary structure of each residue (alpha-helices, beta-sheets, etc.). Researchers use it in work on protein function and drug discovery.
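To make the input/output relationship concrete, here is a minimal sketch of per-residue labeling: each amino acid receives one secondary-structure label, so a prediction is a string of the same length as the sequence. The DSSP-style letters, the 3-state reduction, the helper names, and the example strings are all illustrative assumptions, not the dataset's confirmed format or the toolkit's API.

```python
# Illustrative only: the label alphabet, helper names, and example strings
# are assumptions and are not taken from the ps4-dataset repository.

# A common reduction of DSSP 8-state labels to 3 states (helix, strand, coil).
DSSP_TO_Q3 = {
    "H": "H", "G": "H", "I": "H",  # alpha, 3-10, and pi helices
    "E": "E", "B": "E",            # extended strand and isolated bridge
    "T": "C", "S": "C", "C": "C",  # turn, bend, and coil
}


def to_q3(labels: str) -> str:
    """Map 8-state labels to 3 states; unknown symbols are treated as coil."""
    return "".join(DSSP_TO_Q3.get(symbol, "C") for symbol in labels)


def per_residue_accuracy(predicted: str, actual: str) -> float:
    """Fraction of residues whose predicted label equals the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual labels must have the same length")
    if not actual:
        return 0.0
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


# Made-up example: one label per residue, so all three strings share a length.
sequence = "MKTAYIAKQR"
actual = "CCHHHHHTEE"
predicted = "CCHHHHGSEC"

q8_accuracy = per_residue_accuracy(predicted, actual)
q3_accuracy = per_residue_accuracy(to_q3(predicted), to_q3(actual))
```

Scoring at both granularities shows why the label alphabet matters: a prediction that confuses two helix types is penalized under 8-state scoring but not after the reduction to 3 states.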
No commits in the last 6 months.
Use this if you are a researcher or bioinformatician building or improving computational models for predicting protein secondary structure.
Not ideal if you simply want to predict secondary structure for a few sequences without developing your own model; for that, use the provided Hugging Face Space.
Stars: 14
Forks: 1
Language: Python
License: CC0-1.0
Category:
Last pushed: Dec 29, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/omarperacha/ps4-dataset"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
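The same request can be made from Python. The sketch below uses only the standard library and rests on two assumptions: that the path segments after `quality/` are category, owner, and repository (inferred from the URL above), and that the response body is JSON. Neither the response schema nor the way an API key is supplied is described here, so the code does not guess at either.

```python
import json
import urllib.request

# Base path taken from the curl example; treating the remaining segments as
# category/owner/repo is an assumption inferred from that single URL.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the request URL for one repository."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response, assuming the body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (performs a network request, so it is left commented out):
# data = fetch_quality("ml-frameworks", "omarperacha", "ps4-dataset")
# print(data)
```

Keeping URL construction separate from the network call makes it easy to check the URL without spending any of the daily request quota.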
Higher-rated alternatives
DeepRank/deeprank2
An open-source deep learning framework for data mining of protein-protein interfaces or...
sacdallago/biotrainer
Biological prediction models made simple.
jonathanking/sidechainnet
An all-atom protein structure dataset for machine learning.
BioinfoMachineLearning/DIPS-Plus
The Enhanced Database of Interacting Protein Structures for Interface Prediction
a-r-j/ProteinWorkshop
Benchmarking framework for protein representation learning. Includes a large number of...