Photoroom/datago
A natively parallel dataloader for Python, written in Rust. It serves data at GB/s speeds and covers aspect ratio bucketing, cropping, and resizing for image ML workloads.
When you're training machine learning models on large collections of images, this tool helps you load and prepare that data efficiently. It takes raw image files, or images from web archives or databases, and quickly outputs processed images ready for your model. It is designed for machine learning engineers and researchers whose image datasets are too large, or too slow to load, with standard methods.
Use this if you need to feed massive image datasets into your machine learning models at extremely high speeds, especially across multiple processing units, and require on-the-fly image adjustments like cropping and resizing.
Not ideal if your primary task is general-purpose data analysis, working with small datasets, or if your data consists mainly of text, tabular, or other non-image formats.
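To make "aspect ratio bucketing, crop and resize" concrete, here is a minimal sketch of the general technique in plain Python. It is not datago's API or implementation: the bucket list and the function names `pick_bucket` and `resize_and_crop_plan` are hypothetical, chosen only for illustration. Each image is assigned to the bucket with the closest aspect ratio, scaled until it covers that bucket, and then center-cropped to the exact bucket size.

```python
import math

# Hypothetical bucket sizes as (width, height); not taken from datago.
BUCKETS = [(512, 512), (640, 384), (384, 640), (768, 320), (320, 768)]


def pick_bucket(width, height, buckets=BUCKETS):
    """Return the bucket whose aspect ratio is closest to the image's.

    Ratios are compared in log space so that 2:1 and 1:2 are treated
    as equally far from 1:1.
    """
    log_ratio = math.log(width / height)
    return min(buckets, key=lambda b: abs(math.log(b[0] / b[1]) - log_ratio))


def resize_and_crop_plan(width, height, bucket):
    """Compute the resize target and the centered crop box for one image.

    The image is scaled by the smallest factor that makes it cover the
    bucket in both dimensions, then cropped around the center.
    Returns ((new_width, new_height), (left, top, right, bottom)).
    """
    bucket_w, bucket_h = bucket
    scale = max(bucket_w / width, bucket_h / height)
    # Guard against rounding leaving the resized image one pixel short.
    new_w = max(bucket_w, round(width * scale))
    new_h = max(bucket_h, round(height * scale))
    left = (new_w - bucket_w) // 2
    top = (new_h - bucket_h) // 2
    return (new_w, new_h), (left, top, left + bucket_w, top + bucket_h)


if __name__ == "__main__":
    bucket = pick_bucket(1920, 1080)
    print(bucket, resize_and_crop_plan(1920, 1080, bucket))
```

Because every image in a bucket ends up with identical dimensions, a loader can batch images per bucket without padding. A real pipeline would apply the computed plan with an image library rather than only computing the geometry.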
Stars: 127
Forks: 7
Language: Rust
License: MIT
Category:
Last pushed: Feb 26, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/Photoroom/datago"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
Higher-rated alternatives
EnzymeAD/Enzyme: High-performance automatic differentiation of LLVM and MLIR.
Oxen-AI/Oxen: Lightning fast data version control system for structured and unstructured machine learning...
LaurentMazare/tch-rs: Rust bindings for the C++ API of PyTorch.
SunDoge/dlpark: A Rust library for high-performance tensor exchange with Python.
TheMesocarp/koho: Full spectrum sheaf neural network over arbitrary CW complexes.