yandex-research/tab-ddpm

[ICML 2023] The official implementation of the paper "TabDDPM: Modelling Tabular Data with Diffusion Models"


Established

This project helps data scientists and machine learning engineers create realistic synthetic datasets from existing tabular data. You input your original structured data (like a spreadsheet or database table), and it outputs a new, artificially generated dataset that mirrors the statistical properties of your original data. This is useful for tasks like sharing data while protecting privacy, augmenting small datasets, or testing models without using sensitive real-world information.
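To make "mirrors the statistical properties" concrete, here is a minimal sketch of how a synthetic table might be compared against the original. It does not use TabDDPM or any of this repository's code: `naive_synthesize` is a hypothetical stand-in generator that resamples each column independently, and `mean_gap` and `frequency_gap` are illustrative helper names. The toy rows are made up for the example.

```python
import random
import statistics
from collections import Counter


def naive_synthesize(rows, n, seed=0):
    """Hypothetical stand-in generator (NOT TabDDPM): resample each column independently."""
    rng = random.Random(seed)
    columns = list(zip(*rows))
    return [tuple(rng.choice(col) for col in columns) for _ in range(n)]


def mean_gap(real, synth, col):
    """Absolute difference between the real and synthetic means of a numerical column."""
    real_mean = statistics.mean(r[col] for r in real)
    synth_mean = statistics.mean(s[col] for s in synth)
    return abs(real_mean - synth_mean)


def frequency_gap(real, synth, col):
    """Total variation distance between the category frequencies of a column."""
    real_freq = Counter(r[col] for r in real)
    synth_freq = Counter(s[col] for s in synth)
    categories = set(real_freq) | set(synth_freq)
    return 0.5 * sum(
        abs(real_freq[c] / len(real) - synth_freq[c] / len(synth))
        for c in categories
    )


# Toy table: one numerical column (age) and one categorical column (churned).
real_rows = [(34, "yes"), (51, "no"), (29, "no"), (45, "yes"), (38, "no"), (62, "no")]
synthetic_rows = naive_synthesize(real_rows, n=1000)

print("mean gap (age):", mean_gap(real_rows, synthetic_rows, col=0))
print("frequency gap (churned):", frequency_gap(real_rows, synthetic_rows, col=1))
```

Small gaps on every column suggest the marginal distributions are preserved. Because the stand-in samples columns independently, it ignores relationships between columns, which a learned generative model is meant to capture, so per-column checks like these are only a first sanity test.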

533 stars. No commits in the last 6 months.

Use this if you need to generate high-quality synthetic versions of structured datasets with numerical and categorical columns while preserving their statistical characteristics and protecting privacy.

Not ideal if you are working with unstructured data like images, text, or audio, or if you need to generate entirely new data that doesn't resemble an existing dataset.

data-privacy synthetic-data-generation data-augmentation machine-learning-engineering tabular-data

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 25 / 25


Stars 533

Forks 132

Language Python

License MIT

Related models

huggingface/diffusers

🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.

bghira/SimpleTuner

A general fine-tuning kit geared toward image/video/audio diffusion models.

mcmonkeyprojects/SwarmUI

SwarmUI (formerly StableSwarmUI), A Modular Stable Diffusion Web-User-Interface, with an...

nateraw/stable-diffusion-videos

Create 🔥 videos with Stable Diffusion by exploring the latent space and morphing between text prompts

TheDesignFounder/DreamLayer

Benchmark diffusion models faster. Automate evals, seeds, and metrics for reproducible results.
