sbl-sdsc/mmtf-spark

Methods for the parallel and distributed analysis and mining of the Protein Data Bank using MMTF and Apache Spark.

42 / 100 (Emerging)

This project helps structural biologists and biochemists analyze very large datasets of 3D protein structures, up to the entire Protein Data Bank (PDB). It takes raw PDB files or MMTF-formatted structural data and processes them in parallel with Apache Spark to extract results such as polypeptide chain statistics, structural alignments, or metadata. It is aimed at researchers who need to run large-scale computations across many protein structures.

No commits in the last 6 months.

Use this if you need to perform complex queries or analyses on a very large collection of protein 3D structures and require the computational power of distributed processing.

Not ideal if you are working with individual protein structures or small datasets that do not require distributed computing resources.

structural-biology protein-analysis biomacromolecular-structures protein-data-bank biochemistry-research
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 6 / 25
Maturity 16 / 25
Community 20 / 25

Stars: 21
Forks: 31
Language: Java
License: Apache-2.0
Last pushed: Feb 01, 2019
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/sbl-sdsc/mmtf-spark"

Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
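
The same request can be made from a script. The following is a minimal Python sketch that uses only the standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical, the URL pattern (category, owner, repository) is inferred from the single curl example above, and the response format is not documented on this page, so the body is returned as raw text rather than parsed.

```python
import urllib.request

# Base path taken from the curl example; the {category}/{owner}/{repo}
# layout after it is an assumption inferred from that one URL.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Assemble an endpoint URL in the same shape as the curl example."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the endpoint and return the raw response body as text.

    The response schema is not described in the source, so nothing is parsed.
    Decoding as UTF-8 is an assumption.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_quality("ml-frameworks", "sbl-sdsc", "mmtf-spark")` requests the URL shown in the curl command. Each call counts toward the daily request limit stated above.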