Protein Language Models ML Frameworks

Tools for training and applying transformer-based language models on protein sequences for tasks like fitness prediction, stability estimation, and property inference. Does NOT include structure prediction, sequence alignment, or general protein embeddings without generative/discriminative language modeling.

There are 60 protein language model frameworks tracked. Six score 50 or higher (Established tier). The highest-rated is DeepRank/deeprank2 at 61/100 with 57 stars.

Get the projects as JSON (the example request sets limit=20)

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=ml-frameworks&subcategory=protein-language-models&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
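The same request can be issued from Python. The following is a minimal sketch using only the standard library; it reuses the endpoint and query parameters from the curl example verbatim. The helper names `build_url` and `fetch_projects` are hypothetical, and the shape of the JSON response is not documented here, so the sketch returns the decoded payload without assuming any field names. Whether the API accepts a `limit` larger than 20 is also an assumption left untested.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(limit: int = 20) -> str:
    """Build the request URL with the same query parameters as the curl example."""
    params = {
        "domain": "ml-frameworks",
        "subcategory": "protein-language-models",
        "limit": limit,
    }
    return f"{BASE_URL}?{urllib.parse.urlencode(params)}"


def fetch_projects(limit: int = 20, timeout: float = 30.0):
    """Fetch and decode the JSON payload (hypothetical helper).

    The response schema is not documented in this listing, so the decoded
    object is returned as-is. Each call counts against the daily quota.
    """
    with urllib.request.urlopen(build_url(limit), timeout=timeout) as response:
        return json.load(response)


# Show the URL that would be requested; no network call is made here.
print(build_url())
```

Calling `fetch_projects()` performs the actual request. Without a key, the listing states a quota of 100 requests/day, so caching the result locally is a sensible default.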

| # | Framework | Description | Score | Tier |
|---|-----------|-------------|-------|------|
| 1 | DeepRank/deeprank2 | An open-source deep learning framework for data mining of protein-protein... | 61 | Established |
| 2 | sacdallago/biotrainer | Biological prediction models made simple. | 60 | Established |
| 3 | jonathanking/sidechainnet | An all-atom protein structure dataset for machine learning. | 52 | Established |
| 4 | a-r-j/ProteinWorkshop | Benchmarking framework for protein representation learning. Includes a large... | 50 | Established |
| 5 | songlab-cal/tape | Tasks Assessing Protein Embeddings (TAPE), a set of five biologically... | 50 | Established |
| 6 | BioinfoMachineLearning/DIPS-Plus | The Enhanced Database of Interacting Protein Structures for Interface Prediction | 50 | Established |
| 7 | flatironinstitute/DeepFRI | Deep functional residue identification | 49 | Emerging |
| 8 | aqlaboratory/proteinnet | Standardized data set for machine learning of protein structure | 49 | Emerging |
| 9 | idrblab/AnnoPRO | Feature map and function annotation of Proteins | 48 | Emerging |
| 10 | jaswindersingh2/SPOT-RNA | RNA Secondary Structure Prediction using an Ensemble of Two-dimensional Deep... | 48 | Emerging |
| 11 | LBM-EPFL/PeSTo | Geometric deep learning method to predict protein binding interfaces from a... | 44 | Emerging |
| 12 | jonathanking/protein-transformer | Predicting protein structure through sequence modeling | 42 | Emerging |
| 13 | michaelhla/pro-1 | reasoning model trained using GRPO towards rosetta REF2015 for protein stability | 42 | Emerging |
| 14 | HannesStark/protein-localization | Using Transformer protein embeddings with a linear attention mechanism to... | 41 | Emerging |
| 15 | vsomnath/holoprot | Multi-Scale Representation Learning on Proteins (NeurIPS 2021) | 40 | Emerging |
| 16 | anton-bushuiev/PPIformer | Learning to design protein-protein interactions with enhanced generalization... | 40 | Emerging |
| 17 | adaptyvbio/ProteinFlow | Versatile computational pipeline for processing protein structure data for... | 38 | Emerging |
| 18 | anton-bushuiev/PPIRef | Dataset and package for working with protein-protein interactions in 3D | 38 | Emerging |
| 19 | aws-samples/lm-gvp | LM-GVP: A Generalizable Deep Learning Framework for Protein Property... | 37 | Emerging |
| 20 | victor369basu/ProteinStructurePrediction | Protein structure prediction is the task of predicting the 3-dimensional... | 37 | Emerging |
| 21 | vam-sin/CATHe | Deep Learning tool trained on protein sequence embeddings from protein... | 37 | Emerging |
| 22 | conradry/prtm | Deep learning for protein science | 35 | Emerging |
| 23 | lightonai/RITA | RITA is a family of autoregressive protein models, developed by LightOn in... | 35 | Emerging |
| 24 | OpenProteinAI/PoET | Inference code for PoET: A generative model of protein families as... | 35 | Emerging |
| 25 | dohlee/abyssal-pytorch | Implementation of Abyssal, a deep neural network trained with a new "mega"... | 34 | Emerging |
| 26 | QizhiPei/BioT5 | BioT5 (EMNLP 2023) and BioT5+ (ACL 2024 Findings) | 34 | Emerging |
| 27 | draeger-lab/TFpredict | Identification and structural characterization of transcription factors... | 34 | Emerging |
| 28 | MachineLearningLifeScience/protein_regression | The codebase to replicate the analysis of "A systematic analysis of... | 33 | Emerging |
| 29 | bioinfodlsu/phage-host-prediction | Published in PLOS ONE. Phage-host interaction prediction tool that uses... | 33 | Emerging |
| 30 | dohlee/rasp-pytorch | Reimplementation of RaSP, a deep neural network for rapid protein stability... | 32 | Emerging |
| 31 | Bitbol-Lab/DiffPALM | Differentiable Pairing using Alignment-based Language Models | 32 | Emerging |
| 32 | google-research/slip | SLIP is a sandbox environment for engineering protein sequences with... | 31 | Emerging |
| 33 | jiaqingxie/DeepProtein | Deep Learning Library and Benchmark for Protein Sequence Learning... | 30 | Emerging |
| 34 | milagjurovska/PPI-link-prediction-with-optimized-gcn-and-gan | Comparing different biologically inspired algorithms in hyperparameter... | 30 | Emerging |
| 35 | JieZheng-ShanghaiTech/SL_benchmark | Benchmarking study of machine learning methods for prediction of synthetic lethality | 30 | Emerging |
| 36 | kiwijuice56/protein-visualizer | Visualizing the function of biological proteins through deep learning. MIT... | 29 | Experimental |
| 37 | jgbrasier/protein-classification | Deep sequence models for protein classification | 28 | Experimental |
| 38 | daisybio/data-leakage-ppi-prediction | Code associated with the paper 'Cracking the blackbox of deep sequence-based... | 27 | Experimental |
| 39 | omarperacha/ps4-dataset | The largest open-source dataset for Protein Single Sequence Secondary... | 27 | Experimental |
| 40 | 310-ai/lib310 | lib310 python package | 25 | Experimental |
| 41 | NIGMS/Protein-Protein-Interactions-using-ML | In this module, you will harness novel machine learning techniques to... | 24 | Experimental |
| 42 | Ulton321/Protein-Language-Model-Steering | Protein-Language-Model-Steering explores how to guide or "steer" large... | 24 | Experimental |
| 43 | bioinfodlsu/PHIStruct | Published in Bioinformatics. Phage-host interaction prediction tool that... | 23 | Experimental |
| 44 | MPI-Dortmund/pymissense | PyMissense creates the pathogenicity plot and modified pdb as shown in the... | 22 | Experimental |
| 45 | jsmccabe1/ApiPred | Predict fitness phenotypes and invasion machinery in apicomplexan parasites... | 22 | Experimental |
| 46 | shruti-sivakumar/MSA-Comparative-Study | Benchmarking 6 MSA tools (Clustal Omega, MUSCLE, MAGUS, M-Coffee, MSA Probs,... | 21 | Experimental |
| 47 | DeepFoldProtein/OTalign | OTalign: Protein sequence alignment for remote homologs using Protein... | 20 | Experimental |
| 48 | kren-ai-lab/RUDEUS | Developing classification models for DNA-Binding proteins through machine... | 19 | Experimental |
| 49 | phenolophthaleinum/phastDNA | Virus-host interaction prediction using local fluctuations of genome... | 19 | Experimental |
| 50 | claopodium/SLP-for-Bio | Based on Single Layer Perceptron model, the programme intended to locate key... | 18 | Experimental |
| 51 | allamiro/9mers-structure-prediction | Protein structure prediction for CullPDB 9-mer fragments using multi-input... | 17 | Experimental |
| 52 | A-Hareed/BackMapNet | BackMapNet is a deep-learning framework for reconstructing all-atom protein... | 14 | Experimental |
| 53 | Faluminus/Sec-PRED | Protein secondary structure prediction from amino acid sequence using... | 14 | Experimental |
| 54 | mciaravino/yeast-protein-classification | Multiclass classification of yeast protein localization sites using multiple... | 11 | Experimental |
| 55 | kalininalab/NaturalPPLuM | Repository of the paper "Exploring sequence landscape of biosynthetic gene... | 11 | Experimental |
| 56 | FAhtisham/Protein-Crotonylation | The repository contains the codes to predict protein crotonylation, existing... | 11 | Experimental |
| 57 | DSIMB/PYTHIA | Deep Learning Approach For Local Protein Conformation Prediction | 11 | Experimental |
| 58 | dohlee/tranception-pytorch | Implementation of Tranception, a SOTA transformer model for protein fitness... | 11 | Experimental |
| 59 | albaaggbb/protein-classification-ann | Deep learning approach for multiclass classification of mice based on... | 10 | Experimental |
| 60 | khanovico/neural-struct-detect | deep neural network for detecting structure of protein | 10 | Experimental |
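The tier boundaries are not stated explicitly, but the rows above are consistent with a simple rule: 50 and up is Established, 30 to 49 is Emerging, and below 30 is Experimental. The sketch below encodes that inferred rule and checks it against a few rows from the listing. The cutoffs and the `tier_for` helper are assumptions derived from the observed scores, not a documented scoring method.

```python
from collections import Counter

# Cutoffs inferred from the listing (assumption, not documented):
# the lowest Established score shown is 50, the lowest Emerging score is 30.
ESTABLISHED_MIN = 50
EMERGING_MIN = 30


def tier_for(score: int) -> str:
    """Map a 0-100 quality score to a tier using the inferred cutoffs."""
    if score >= ESTABLISHED_MIN:
        return "Established"
    if score >= EMERGING_MIN:
        return "Emerging"
    return "Experimental"


# A few (framework, score) pairs copied from the listing, chosen at the boundaries.
sample = {
    "DeepRank/deeprank2": 61,
    "BioinfoMachineLearning/DIPS-Plus": 50,
    "flatironinstitute/DeepFRI": 49,
    "JieZheng-ShanghaiTech/SL_benchmark": 30,
    "kiwijuice56/protein-visualizer": 29,
    "khanovico/neural-struct-detect": 10,
}

# Tally how many of the sampled frameworks fall into each tier.
tier_counts = Counter(tier_for(score) for score in sample.values())

for name, score in sample.items():
    print(f"{name}: {score} -> {tier_for(score)}")
```

Applied to all 60 rows, this rule reproduces the tier shown for every entry, including the six Established frameworks.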