scDataset/scDataset

scDataset: Scalable Data Loading for Deep Learning on Large-Scale Single-Cell Omics

Score: 47 / 100 (Emerging)

scDataset helps single-cell omics researchers load and process very large datasets efficiently for deep learning. You supply single-cell data, such as gene expression or protein measurements, in formats like AnnData or NumPy arrays, and it yields structured batches ready for training deep learning models. It is aimed at scientists and computational biologists working with very large single-cell genomics or proteomics datasets.

Available on PyPI.

Use this if you are a single-cell biologist or computational scientist training deep learning models on single-cell omics datasets that are too large to fit into memory.

Not ideal if your datasets are small enough to be loaded entirely into memory, or if you are not using deep learning for single-cell omics analysis.
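To make the batching idea above concrete, here is a minimal sketch in plain NumPy. It is not scDataset's API: `iter_batches` and its parameters are hypothetical names chosen for illustration, and the toy matrix stands in for a real cells-by-genes array. The sketch shuffles the order of contiguous blocks rather than individual rows, which is one common way to keep reads sequential when the array is memory-mapped from disk.

```python
import numpy as np


def iter_batches(data, batch_size, shuffle=True, seed=0):
    """Yield batches of rows from `data`.

    Hypothetical helper for illustration only; not part of scDataset.
    """
    n_rows = data.shape[0]
    # Starting row index of each contiguous block of `batch_size` rows.
    starts = np.arange(0, n_rows, batch_size)
    if shuffle:
        # Shuffle the block order, not the rows inside each block.
        rng = np.random.default_rng(seed)
        rng.shuffle(starts)
    for start in starts:
        stop = min(start + batch_size, n_rows)
        # A contiguous slice keeps disk access sequential for memory-mapped input.
        yield np.asarray(data[start:stop])


# Toy stand-in for a cells x genes expression matrix (10 cells, 4 genes).
expression = np.arange(10 * 4, dtype=np.float32).reshape(10, 4)
batches = list(iter_batches(expression, batch_size=3))
batch_shapes = sorted(b.shape for b in batches)
total_rows = sum(b.shape[0] for b in batches)
```

For data that does not fit in memory, the same loop could be pointed at an array opened with `np.load(path, mmap_mode="r")`, so that only the rows of the current batch are read.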

single-cell-omics genomics proteomics computational-biology bioinformatics
Maintenance 10 / 25
Adoption 8 / 25
Maturity 24 / 25
Community 5 / 25


Stars: 43
Forks: 2
Language: Jupyter Notebook
License: MIT
Last pushed: Jan 30, 2026
Commits (30d): 0
Dependencies: 2

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/scDataset/scDataset"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
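The same request can be sketched in Python using only the standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical, the reading of the URL path as category, owner, and repository is inferred from the curl example, and the assumption that the endpoint returns JSON is not confirmed by this page.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL from its path segments (hypothetical helper)."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(url, timeout=10):
    """Fetch and decode the response.

    Assumption: the endpoint responds with a JSON body.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


url = build_quality_url("ml-frameworks", "scDataset", "scDataset")
# report = fetch_quality(url)  # uncomment to perform the network request
```

The network call is left commented out so the snippet can be run without consuming any of the daily request quota.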