sbl-sdsc/mmtf-spark

Methods for the parallel and distributed analysis and mining of the Protein Data Bank using MMTF and Apache Spark.

/ 100

Emerging

This project helps structural biologists and biochemists analyze very large collections of 3D protein structures, up to the entire Protein Data Bank (PDB). It reads raw PDB files or MMTF-formatted structural data and processes them in parallel with Apache Spark to extract results such as polypeptide chain statistics, structural alignments, or metadata. It is aimed at researchers who need to run large-scale computations across many protein structures.

No commits in the last 6 months.

Use this if you need to perform complex queries or analyses on a very large collection of protein 3D structures and require the computational power of distributed processing.

Not ideal if you are working with individual protein structures or small datasets that do not require distributed computing resources.

structural-biology protein-analysis biomacromolecular-structures protein-data-bank biochemistry-research

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 6 / 25

Maturity 16 / 25

Community 20 / 25

Language

Java

License

Apache-2.0

Higher-rated alternatives

lensacom/sparkit-learn

PySpark + Scikit-learn = Sparkit-learn

Angel-ML/angel

A Flexible and Powerful Parameter Server for large-scale machine learning

flink-extended/dl-on-flink

Deep Learning on Flink aims to integrate Flink and deep learning frameworks (e.g. TensorFlow,...

MingChen0919/learning-apache-spark

Notes on Apache Spark (pyspark)

mahmoudparsian/data-algorithms-book

MapReduce, Spark, Java, and Scala for Data Algorithms Book
