tbepler/prose

Multi-task and masked language model-based protein sequence embedding models.

44 / 100 (Emerging)

This project helps biological researchers and computational biologists analyze protein sequences by converting raw protein sequences into numerical representations called embeddings. You provide protein sequences, typically in FASTA format, and it outputs a file containing these embeddings, which can then be used for downstream computational tasks like predicting protein function or structure. It's designed for those who need to computationally process and understand large sets of protein data.
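
To make the input and output shapes concrete, here is a minimal sketch of the FASTA-in, vectors-out workflow described above. It does not use the ProSE models or any code from the repository: a 20-dimensional amino-acid composition vector stands in as a placeholder for a learned embedding, and the names `read_fasta`, `composition_embedding`, and `embed_fasta` are hypothetical, chosen only for illustration.

```python
import io
from typing import Dict, Iterator, List, TextIO, Tuple

# The 20 standard amino acids, in a fixed order that defines the vector layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from a FASTA-formatted text stream."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            parts = line[1:].split()
            name, chunks = (parts[0] if parts else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def composition_embedding(seq: str) -> List[float]:
    """Placeholder embedding (NOT ProSE): fraction of each standard amino acid."""
    seq = seq.upper()
    counts = [seq.count(aa) for aa in AMINO_ACIDS]
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]


def embed_fasta(handle: TextIO) -> Dict[str, List[float]]:
    """Map every sequence name in the FASTA stream to a fixed-length vector."""
    return {name: composition_embedding(seq) for name, seq in read_fasta(handle)}


# Two toy records; the second sequence spans two lines, as FASTA allows.
demo = io.StringIO(">seq1 demo\nACDA\n>seq2\nMKV\nMK\n")
embeddings = embed_fasta(demo)
```

A learned embedding model would replace `composition_embedding`; the surrounding structure (one fixed-length vector per named sequence, ready for a downstream classifier or regressor) is the part this sketch is meant to show.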

106 stars. No commits in the last 6 months.

Use this if you need to transform raw protein sequences into a numerical format suitable for machine learning or other computational analyses in biology.

Not ideal if you are looking for a tool to directly predict protein structures or functions without needing to work with numerical embeddings.

protein-science bioinformatics computational-biology protein-engineering structural-biology
Stale (6m), No Package, No Dependents
Maintenance 0 / 25
Adoption 9 / 25
Maturity 16 / 25
Community 19 / 25

Stars: 106
Forks: 21
Language: Python
License:
Last pushed: Jun 16, 2021
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/tbepler/prose"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
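
The same request as the curl command above can be made from Python with the standard library. This is a sketch under two assumptions not stated on this page: that the endpoint returns JSON, and that the URL path follows a category/owner/repository pattern (inferred from the single example URL). The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path taken from the example URL on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL; the path pattern is inferred from one example."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record, assuming the response body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("embeddings", "tbepler", "prose")` issues the same GET request as the curl command. Because the response fields are not documented here, inspect the returned object before relying on any particular key.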