scDataset/scDataset

scDataset: Scalable Data Loading for Deep Learning on Large-Scale Single-Cell Omics

Emerging

scDataset helps single-cell omics researchers load very large datasets efficiently for deep learning. You supply single-cell data, such as gene expression or protein measurements, as AnnData objects or NumPy arrays, and the tool yields batches of that data ready for model training. It is aimed at scientists and computational biologists working with very large single-cell genomics or proteomics datasets.
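To make the idea of batch-wise loading concrete, here is a minimal sketch in plain NumPy. It is not scDataset's API: `iter_batches`, the file name `expr.dat`, and the array shape are hypothetical, chosen only to show how batches can be drawn from an on-disk array without reading the whole matrix into memory.

```python
import os
import tempfile

import numpy as np


def iter_batches(data, batch_size, shuffle=True, seed=0):
    """Yield in-memory batches of rows from an array-like `data`.

    Hypothetical helper for illustration; not part of scDataset.
    """
    n_rows = data.shape[0]
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_rows) if shuffle else np.arange(n_rows)
    for start in range(0, n_rows, batch_size):
        # Sort the indices within a batch so disk reads are closer to sequential.
        idx = np.sort(order[start:start + batch_size])
        # Only the selected rows are copied into memory.
        yield np.asarray(data[idx])


# Stand-in for a large cells x genes matrix stored on disk (assumed layout).
path = os.path.join(tempfile.mkdtemp(), "expr.dat")
expr = np.memmap(path, dtype=np.float32, mode="w+", shape=(1000, 50))
expr[:] = np.arange(1000, dtype=np.float32)[:, None]  # row i is filled with the value i
expr.flush()

batches = list(iter_batches(expr, batch_size=64))
```

Each yielded batch is an ordinary NumPy array, so it can be passed to whichever deep learning framework is in use. For the real interface, consult the scDataset documentation.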

Available on PyPI.

Use this if you are a single-cell biologist or computational scientist training deep learning models on single-cell omics datasets that are too large to fit into memory.

Not ideal if your datasets are small enough to be loaded entirely into memory, or if you are not using deep learning for single-cell omics analysis.

Tags: single-cell-omics, genomics, proteomics, computational-biology, bioinformatics

Maintenance 10 / 25

Adoption 8 / 25

Maturity 24 / 25

Community 5 / 25

Language: Jupyter Notebook

License: MIT

Higher-rated alternatives

scverse/scanpy

Single-cell analysis in Python. Scales to >100M cells.

scverse/scvi-tools

Deep probabilistic analysis of single-cell and spatial omics data

Teichlab/celltypist

A tool for semi-automatic cell type classification

theislab/scarches

Reference mapping for single-cell genomics

Teichlab/cellhint

A tool for semi-automatic cell type harmonization and integration