laminlabs/lamindb
Open-source data framework for biology. Context and memory for datasets and models at scale. Query, trace & validate with a lineage-native lakehouse that supports bio-formats, registries & ontologies. 🍊YC S22
This framework helps biologists and researchers manage, query, and validate their complex biological datasets and models. It allows you to bring in various bio-formats (like AnnData or Zarr) and biological registries, and it outputs well-organized, traceable, and reproducible data and analysis results. It's designed for scientists and engineers working on biological R&D.
236 stars. Used by 1 other package. Available on PyPI.
Use this if you need to reliably trace the origins of your biological data and models, ensure data quality through validation, and query vast amounts of biological information across different experiments and formats.
Not ideal if your work doesn't involve biological data or if you primarily need a simple file storage solution without complex data lineage or validation.
Stars: 236
Forks: 22
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 18, 2026
Commits (30d): 0
Dependencies: 1
Reverse dependents: 1
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/data-engineering/laminlabs/lamindb"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
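The curl request above can also be issued from Python. The following is a minimal sketch using only the standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical, the `/<category>/<owner>/<repo>` reading of the path is an assumption inferred from the single URL shown, and the response is assumed to be JSON, which the page does not state.

```python
import json
import urllib.request

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, assuming a /<category>/<owner>/<repo> layout."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Request the endpoint and decode the body as JSON (JSON is an assumption)."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("data-engineering", "laminlabs", "lamindb")` performs a real network request and counts against the keyless daily limit, so it is worth caching the result if you poll regularly.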
Related tools
PrefectHQ/prefect
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
growthbook/growthbook
Open Source Feature Flags, Experimentation, and Product Analytics
koopjs/koop
Transform, query, and download geospatial data on the web.
pathwaycom/pathway
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
dagster-io/dagster
An orchestration platform for the development, production, and observation of data assets.