daisybio/data-leakage-ppi-prediction
Code associated with the paper 'Cracking the blackbox of deep sequence-based protein-protein interaction prediction'
This project critically evaluates existing computational methods that predict protein-protein interactions (PPIs) from protein sequences. It takes established PPI datasets and protein sequence information and re-evaluates the methods under stricter conditions, testing whether their high reported accuracy reflects true biological signal or hidden similarities in the data (data leakage). It is aimed at computational biologists and biochemists who develop or assess deep learning models for PPI prediction.
No commits in the last 6 months.
Use this if you are a researcher in bioinformatics or computational biology looking to understand the limitations and potential biases in current deep sequence-based protein-protein interaction prediction models.
Not ideal if you want a new, robust method for predicting protein-protein interactions rather than a critical evaluation of existing models.
Stars: 27
Forks: 1
Language: C++
License: GPL-3.0
Category:
Last pushed: Jan 08, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/daisybio/data-leakage-ppi-prediction"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
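The same request can be made from a script. The following is a minimal sketch that uses only the Python standard library; the helper names `build_quality_url` and `fetch_quality` are hypothetical, and the assumption that the endpoint returns a JSON body is not confirmed by this page. It uses the keyless tier, since how an API key is passed is not stated here.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL from a category slug and an owner/name repo slug."""
    # quote() leaves "/" unescaped by default, so the owner/name separator survives.
    return f"{BASE_URL}/{quote(category)}/{quote(repo)}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository (hypothetical helper).

    Assumption: the response body is JSON. If it is not, json.loads raises
    a ValueError and the raw body should be inspected instead.
    """
    url = build_quality_url(category, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Reproduces the URL used in the curl command; no network call is made here.
URL = build_quality_url("ml-frameworks", "daisybio/data-leakage-ppi-prediction")

# Usage (performs a network request and counts against the daily limit):
# data = fetch_quality("ml-frameworks", "daisybio/data-leakage-ppi-prediction")
```

Because the keyless tier is limited to 100 requests per day, a script that polls many repositories may want to cache responses locally.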
Higher-rated alternatives
DeepRank/deeprank2
An open-source deep learning framework for data mining of protein-protein interfaces or...
sacdallago/biotrainer
Biological prediction models made simple.
jonathanking/sidechainnet
An all-atom protein structure dataset for machine learning.
a-r-j/ProteinWorkshop
Benchmarking framework for protein representation learning. Includes a large number of...
songlab-cal/tape
Tasks Assessing Protein Embeddings (TAPE), a set of five biologically relevant semi-supervised...