projectglow/glow

An open-source toolkit for large-scale genomic analysis

Overall 60 / 100 (Established)

This toolkit helps bioinformaticians and genetic researchers process very large genomic datasets, such as those held in biobanks. It reads genomic data files (VCF, BGEN, PLINK) and lets users perform quality control, normalize variants, run genome-wide association studies (GWAS), and integrate the results with other health data. It runs on Apache Spark, so these analyses scale to massive data volumes.

296 stars.

Use this if you are a bioinformatician or geneticist working with very large genomic datasets and need to perform complex analyses and integrate different data types at scale.

Not ideal if you are working with small genomic datasets or prefer not to use the Apache Spark ecosystem for your analyses.

genomic-analysis bioinformatics genetic-research biobank-data population-genetics
No Package, No Dependents
Maintenance 10 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 24 / 25

Stars 296
Forks 118
Language Scala
License Apache-2.0
Last pushed Feb 15, 2026
Commits (30d) 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/projectglow/glow"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
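
The same request can be sketched in Python using only the standard library. This is a minimal sketch, not a documented client. It assumes the path follows a category/owner/repo pattern, as the curl command above suggests, and that the response body is JSON. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (hypothetical helper).

    Assumes the path is <category>/<owner>/<repo>, inferred from the
    single example URL shown in the listing.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record without an API key (hypothetical helper).

    Assumes the server answers with a JSON body; the response schema is
    not described in the listing, so the decoded value is returned as-is.
    """
    request = urllib.request.Request(
        quality_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("ml-frameworks", "projectglow", "glow")` issues the same GET request as the curl command. Because the keyless tier allows 100 requests/day, it may be worth caching the result locally rather than refetching it on every run.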