ml6team/fondant

Production-ready data processing made easy and shareable

/ 100

Established

This tool helps machine learning engineers and data scientists collaboratively build, process, and share large datasets for AI model training. You start with raw data, apply a series of transformations such as image resizing or text filtering, and end with a clean, structured dataset ready for your models. It streamlines the creation of production-ready datasets, so you don't have to manage complex data pipelines by hand.

359 stars. Available on PyPI.

Use this if you need to build and manage complex, multi-step data processing workflows for machine learning, especially when working in a team or needing to reuse data operations.

Not ideal if you're dealing with very small datasets or require only simple, one-off data transformations that don't need sharing or scalable execution.

MLOps, data engineering, dataset preparation, AI model training, data pipeline management

Maintenance 10 / 25

Adoption 10 / 25

Maturity 25 / 25

Community 14 / 25

Stars

359

Forks

Language

Python

License

Apache-2.0

Related frameworks

treeverse/dvc

🦉 Data Versioning and ML Experiments

runpod/runpod-python

🐍 | Python library for RunPod API and serverless worker SDK.

microsoft/vscode-jupyter

VS Code Jupyter extension

4paradigm/OpenMLDB

OpenMLDB is an open-source machine learning database that provides a feature platform computing...

uber/petastorm

Petastorm library enables single machine or distributed training and evaluation of deep learning...
