projectglow/glow
An open-source toolkit for large-scale genomic analysis
This toolkit helps bioinformaticians and genetic researchers process very large genomic datasets, such as those held by biobanks. It reads genomic data files (VCF, BGEN, PLINK) and lets users perform quality control, normalize variants, run genome-wide association studies (GWAS), and integrate the results with other health data. It runs on Apache Spark, so these analyses scale to massive data volumes.
296 stars.
Use this if you are a bioinformatician or geneticist working with very large genomic datasets and need to perform complex analyses and integrate different data types at scale.
Not ideal if you are working with small genomic datasets or prefer not to use the Apache Spark ecosystem for your analyses.
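To make the quality-control step mentioned above concrete, here is a minimal conceptual sketch in plain Python. It is not Glow's API: Glow works on Apache Spark DataFrames, whereas this example uses an in-memory list. The `Variant` class, the function names, and the thresholds (95% call rate, 1% minor allele frequency) are illustrative assumptions, not values taken from the project.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Variant:
    """Hypothetical in-memory variant record (not a Glow type)."""
    contig: str
    position: int
    ref: str
    alt: str
    # One entry per sample: number of alternate alleles (0, 1, or 2),
    # or None when the genotype call is missing.
    genotypes: List[Optional[int]]


def call_rate(variant: Variant) -> float:
    """Fraction of samples with a non-missing genotype call."""
    if not variant.genotypes:
        return 0.0
    called = sum(g is not None for g in variant.genotypes)
    return called / len(variant.genotypes)


def minor_allele_frequency(variant: Variant) -> float:
    """Frequency of the rarer allele among called diploid genotypes."""
    called = [g for g in variant.genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def passes_qc(variant: Variant,
              min_call_rate: float = 0.95,  # assumed threshold
              min_maf: float = 0.01) -> bool:  # assumed threshold
    """Keep variants that are well genotyped and not too rare."""
    return (call_rate(variant) >= min_call_rate
            and minor_allele_frequency(variant) >= min_maf)


variants = [
    Variant("1", 100, "A", "G", [0, 0, 1, 0]),     # fully called, polymorphic
    Variant("1", 200, "C", "T", [0, 1, 2, None]),  # one missing call
    Variant("1", 300, "G", "A", [0, 0, 0, 0]),     # monomorphic
]
kept = [v for v in variants if passes_qc(v)]
```

At biobank scale the same per-variant logic would be expressed as distributed column operations rather than a Python loop, which is the gap a Spark-based toolkit is meant to fill.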
Stars: 296
Forks: 118
Language: Scala
License: Apache-2.0
Category:
Last pushed: Feb 15, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/projectglow/glow"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
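The same request can be made from a script. The sketch below uses only the Python standard library. Two things are assumptions rather than documented behavior: that the endpoint returns JSON, and that the `category/owner/repo` path pattern seen in the URL above generalizes to other projects. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path copied from the curl example; everything after it is
# assumed to follow a category/owner/repo pattern.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the request URL for one project."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str,
                  timeout: float = 10.0):
    """Fetch and decode the response, assuming the body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("ml-frameworks", "projectglow", "glow")` issues the same GET request as the curl command. With the keyless limit of 100 requests/day, a script that polls many projects should cache responses locally.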
Related frameworks
tensorflow/tfx
TFX is an end-to-end platform for deploying production ML pipelines
VowpalWabbit/vowpal_wabbit
Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with...
yahoo/TensorFlowOnSpark
TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters.
Wei-1/Scala-Machine-Learning
No Dependency Scala Machine Learning Algorithm Gallery
yoshoku/rumale
Rumale is a machine learning library in Ruby