webdataset/WebDataset.jl
A high-performance I/O library for deep learning in Julia, based on the PyTorch WebDataset library
WebDataset.jl helps deep learning practitioners load massive datasets efficiently for model training. It reads collections of tar files, each containing groups of related files (such as an image and its label), and produces ready-to-use batches of data. It is aimed at data scientists and machine learning engineers who work with large image, audio, or text datasets and want to speed up their training workflows.
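The tar-to-batches flow described above can be sketched in a few lines. This is an illustrative Python sketch using only the standard library; it does not use WebDataset.jl or reproduce its API, which this page does not show. The function names (`make_demo_shard`, `iter_samples`, `batched`), the sample keys, and the payloads are all hypothetical, and the grouping rule (consecutive tar members that share the name before the first dot form one sample) is an assumption based on the usual WebDataset naming convention.

```python
import io
import tarfile
from itertools import islice


def make_demo_shard(samples):
    """Build an in-memory tar whose members are named <key>.<ext> (made-up demo data)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for key, fields in samples:
            for ext, payload in fields.items():
                info = tarfile.TarInfo(name=f"{key}.{ext}")
                info.size = len(payload)
                tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf


def iter_samples(fileobj):
    """Yield one dict per run of consecutive members that share a key.

    Assumption: the key is everything before the first dot in the member name.
    """
    current_key, sample = None, {}
    with tarfile.open(fileobj=fileobj, mode="r") as tar:
        for member in tar:
            if not member.isfile():
                continue
            key, _, ext = member.name.partition(".")
            if key != current_key and sample:
                yield sample  # the previous group is complete
                sample = {}
            current_key = key
            sample["__key__"] = key
            sample[ext] = tar.extractfile(member).read()
    if sample:
        yield sample  # flush the last group


def batched(iterable, batch_size):
    """Collect items into lists of at most batch_size elements."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


# Hypothetical shard: three samples, each an "image" plus a class label.
demo = [
    ("sample0001", {"jpg": b"<image bytes 1>", "cls": b"0"}),
    ("sample0002", {"jpg": b"<image bytes 2>", "cls": b"1"}),
    ("sample0003", {"jpg": b"<image bytes 3>", "cls": b"0"}),
]
batches = list(batched(iter_samples(make_demo_shard(demo)), batch_size=2))
```

Because the members of a sample sit next to each other in the archive, the reader can make a single sequential pass over the tar file instead of opening many small files, which is where the efficiency for large datasets comes from. A real loader would also decode the raw bytes (for example, into image arrays) before batching.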
Use this if you are a machine learning engineer or data scientist training deep learning models in Julia and need to handle very large datasets efficiently, especially when dealing with many small files.
Not ideal if your dataset is small or already stored in a single, easily loadable file (such as a CSV of tabular data), or if you are not working with deep learning models.
Stars: 14
Forks: 1
Language: Julia
License: MIT
Category:
Last pushed: Dec 18, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/webdataset/WebDataset.jl"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
Higher-rated alternatives
CliMA/Oceananigans.jl
🌊 Julia software for fast, friendly, flexible, ocean-flavored fluid dynamics on CPUs and GPUs
JuliaLang/julia
The Julia Programming Language
WassimTenachi/PhySO
Physical Symbolic Optimization
FluxML/Flux.jl
Relax! Flux is the ML library that doesn't make you tensor
EnzymeAD/Enzyme.jl
Julia bindings for the Enzyme automatic differentiator