danijar/granular

Fast dataset format and loader

Emerging

Granular is a dataset format and loader for developers who work with large, complex datasets. It stores and loads diverse data types such as images, text, and numerical arrays efficiently: you write raw data in various formats and read it back as a structured dataset that is fast to access and process, especially in machine learning workflows. It is aimed at data engineers and machine learning engineers who manage big data pipelines.

Available on PyPI.

Use this if you need a flexible and high-performance way to store and load custom datasets with diverse data types, especially when random access and resumable processing are important.
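
To make "random access" and "resumable processing" concrete, here is a minimal standard-library sketch of the general idea: records are stored back to back in one file, and a separate index of byte offsets lets a reader jump straight to any record or continue from a saved position. This is not Granular's API or on-disk format; the names `write_records`, `read_record`, and `resume_from` and the `.idx` file layout are hypothetical and chosen only for illustration.

```python
import os
import struct
import tempfile


def write_records(path, records):
    """Write byte records back to back and store their offsets in path + '.idx'."""
    offsets = []
    with open(path, 'wb') as data:
        for record in records:
            offsets.append(data.tell())
            data.write(record)
        offsets.append(data.tell())  # End offset of the last record.
    with open(path + '.idx', 'wb') as index:
        # Each offset is an unsigned 64-bit little-endian integer (8 bytes).
        index.write(struct.pack(f'<{len(offsets)}Q', *offsets))


def read_record(path, i):
    """Read record i directly, without scanning the records before it."""
    with open(path + '.idx', 'rb') as index:
        index.seek(8 * i)
        start, stop = struct.unpack('<2Q', index.read(16))
    with open(path, 'rb') as data:
        data.seek(start)
        return data.read(stop - start)


def resume_from(path, step):
    """Yield (position, record) pairs starting at a previously saved position."""
    count = os.path.getsize(path + '.idx') // 8 - 1
    for i in range(step, count):
        yield i, read_record(path, i)


# Toy usage: three records of different kinds stored as raw bytes.
directory = tempfile.mkdtemp()
path = os.path.join(directory, 'toy.data')
write_records(path, [b'image-bytes', b'some text', b'\x00\x01\x02'])
```

Because the index has a fixed-size entry per record, looking up record `i` costs two seeks regardless of dataset size, and a loader only needs to save an integer position to resume after an interruption.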

Not ideal if you're looking for a simple CSV or JSON file loader, or if your dataset is small and easily fits into memory.

data-engineering machine-learning-ops large-scale-data data-loading dataset-management

Maintenance 10 / 25

Adoption 6 / 25

Maturity 25 / 25

Community 7 / 25

Language: Python

License: MIT

Higher-rated alternatives

pykt-team/pykt-toolkit

pyKT: A Python Library to Benchmark Deep Learning based Knowledge Tracing Models

microsoft/archai

Accelerate your Neural Architecture Search (NAS) through fast, reproducible and modular research.

google-research/morph-net

Fast & Simple Resource-Constrained Learning of Deep Network Structure

IDEALLab/EngiBench

Benchmarks for automated engineering design

AI-team-UoA/pyJedAI

An open-source library that leverages Python’s data science ecosystem to build powerful...
