DACUS1995/pytorch-mmap-dataset
A custom PyTorch Dataset extension that provides faster iteration and lower RAM usage
This tool helps machine learning engineers and researchers efficiently load large image or numerical datasets into PyTorch models. It takes your raw data, like image files, and creates a memory-mapped version, allowing for much faster training iterations and reduced RAM consumption. This is ideal for those working with extensive datasets that traditionally strain system memory during model training.
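To illustrate the memory-mapping idea described above, here is a minimal sketch using only the Python standard library. It is not this repository's API: the class name `MmapRowDataset`, the helper `write_rows`, and the flat float32 file layout are all assumptions made for the example. The class follows the `__getitem__`/`__len__` protocol that PyTorch uses for map-style datasets, so the same pattern could be applied in a `torch.utils.data.Dataset` subclass.

```python
import mmap
import os
import struct
import tempfile


class MmapRowDataset:
    """Hypothetical map-style dataset over a flat file of float32 rows."""

    def __init__(self, path, row_len):
        self._fmt = f"<{row_len}f"  # assumed layout: little-endian float32 values
        self._row_bytes = struct.calcsize(self._fmt)
        self._file = open(path, "rb")
        # Map the file read-only; the OS pages data in lazily on access,
        # so the whole file never has to be resident in RAM at once.
        self._mm = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)
        self._n = len(self._mm) // self._row_bytes

    def __len__(self):
        return self._n

    def __getitem__(self, idx):
        if not 0 <= idx < self._n:
            raise IndexError(idx)
        start = idx * self._row_bytes
        # Only the bytes of the requested row are copied out of the mapping.
        return struct.unpack(self._fmt, self._mm[start:start + self._row_bytes])

    def close(self):
        self._mm.close()
        self._file.close()


def write_rows(path, rows):
    """Hypothetical one-time conversion step: pack raw rows into the flat file."""
    with open(path, "wb") as f:
        for row in rows:
            f.write(struct.pack(f"<{len(row)}f", *row))


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "rows.bin")
        write_rows(path, [(0.0, 1.0, 2.0), (3.0, 4.0, 5.0)])
        ds = MmapRowDataset(path, row_len=3)
        try:
            return len(ds), ds[1]
        finally:
            ds.close()


if __name__ == "__main__":
    print(demo())
```

The conversion cost is paid once when the flat file is written. After that, each lookup reads only the requested row, which is where the RAM savings in this kind of design would come from.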
No commits in the last 6 months.
Use this if you are a machine learning practitioner experiencing slow data loading or out-of-memory errors when training PyTorch models on large datasets.
Not ideal if your datasets are small and fit comfortably in RAM, or if you are not using PyTorch for your machine learning workflows.
Stars: 46
Forks: 7
Language: Python
License: MIT
Category:
Last pushed: Mar 14, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/DACUS1995/pytorch-mmap-dataset"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
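The same request can be sketched in Python with the standard library. The helper names `quality_url` and `fetch_quality` are hypothetical, and the `category/owner/repo` path pattern is inferred from the single curl example above rather than from API documentation. The response format is not stated here, so the sketch returns the raw body as text.

```python
from urllib.parse import quote
from urllib.request import urlopen

BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL (path pattern assumed from the curl example)."""
    return f"{BASE}/{quote(category)}/{quote(owner)}/{quote(repo)}"


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the raw response body; the payload format is not assumed."""
    with urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    print(fetch_quality("ml-frameworks", "DACUS1995", "pytorch-mmap-dataset"))
```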
Higher-rated alternatives
opentensor/bittensor: Internet-scale Neural Networks
trailofbits/fickling: A Python pickling decompiler and static analyzer
benchopt/benchopt: A framework for reproducible, comparable benchmarks
BiomedSciAI/fuse-med-ml: A python framework accelerating ML based discovery in the medical field by encouraging code...
mosaicml/streaming: A Data Streaming Library for Efficient Neural Network Training