ml6team/fondant
Production-ready data processing made easy and shareable
This tool helps machine learning engineers and data scientists collaboratively build, process, and share large datasets for AI model training. You start with raw data, apply a series of transformations such as image resizing or text filtering, and end with a clean, structured dataset ready for your models. It streamlines the creation of production-ready datasets so that you do not have to manage complex data pipelines by hand.
359 stars. Available on PyPI.
Use this if you need to build and manage complex, multi-step data processing workflows for machine learning, especially when working in a team or needing to reuse data operations.
Not ideal if you're dealing with very small datasets or require only simple, one-off data transformations that don't need sharing or scalable execution.
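The workflow described above (raw data in, a chain of transformations, a clean dataset out) can be illustrated with a minimal sketch in plain Python. This is not Fondant's API: `Record`, `strip_whitespace`, `drop_short`, and `run_pipeline` are hypothetical names invented for illustration, and the text-filtering steps stand in for whatever reusable operations a real pipeline would chain.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List


# Hypothetical record type for illustration only; not a Fondant class.
@dataclass
class Record:
    text: str


# A step consumes a stream of records and yields a transformed stream.
Step = Callable[[Iterable[Record]], Iterator[Record]]


def strip_whitespace(records: Iterable[Record]) -> Iterator[Record]:
    """Normalize each record by trimming leading and trailing whitespace."""
    for record in records:
        yield Record(text=record.text.strip())


def drop_short(min_chars: int) -> Step:
    """Build a filtering step that keeps records with at least min_chars characters."""
    def step(records: Iterable[Record]) -> Iterator[Record]:
        for record in records:
            if len(record.text) >= min_chars:
                yield record
    return step


def run_pipeline(records: Iterable[Record], steps: List[Step]) -> List[Record]:
    """Apply each step in order and materialize the final dataset."""
    for step in steps:
        records = step(records)
    return list(records)


raw = [Record("  hello world  "), Record(" hi "), Record("a longer sentence")]
clean = run_pipeline(raw, [strip_whitespace, drop_short(5)])
print([record.text for record in clean])
```

Because each step is a self-contained function over a stream of records, the same step can be reused in another pipeline, which is the kind of sharing the description refers to.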
Stars: 359
Forks: 29
Language: Python
License: Apache-2.0
Category:
Last pushed: Feb 20, 2026
Commits (30d): 0
Dependencies: 7
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/ml6team/fondant"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
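The same request can be made from Python with only the standard library. The sketch below is a hedged illustration: the function names `build_quality_url` and `fetch_quality` are hypothetical, it assumes other entries follow the same `/<category>/<owner>/<repo>` path layout as the URL shown above, and it assumes the response body is UTF-8 text. The response schema is not documented here, so the body is returned as a raw string rather than parsed.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL for a repository given as 'owner/name'.

    Assumption: every entry uses the same path layout as the example URL.
    quote() leaves '/' unescaped by default, so 'owner/name' stays two segments.
    """
    return f"{BASE_URL}/{quote(category)}/{quote(repo)}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the raw response body (assumed to be UTF-8 text)."""
    with urlopen(build_quality_url(category, repo), timeout=timeout) as response:
        return response.read().decode("utf-8")


url = build_quality_url("ml-frameworks", "ml6team/fondant")
print(url)
```

Calling `fetch_quality("ml-frameworks", "ml6team/fondant")` performs the network request and counts against the daily request limit.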
Related frameworks
treeverse/dvc
🦉 Data Versioning and ML Experiments
runpod/runpod-python
🐍 | Python library for RunPod API and serverless worker SDK.
microsoft/vscode-jupyter
VS Code Jupyter extension
4paradigm/OpenMLDB
OpenMLDB is an open-source machine learning database that provides a feature platform computing...
uber/petastorm
Petastorm library enables single machine or distributed training and evaluation of deep learning...