tdspora/syngen

Open-source version of the TDspora synthetic data generation algorithm.

Established

This tool creates realistic synthetic datasets from existing tabular data, such as CSV or Excel files, without revealing sensitive information. It takes an original dataset as input and generates a new, synthetic dataset that mimics the statistical properties of the real data. It is aimed at data scientists, analysts, and anyone who needs test data for development, training, or sharing.

Available on PyPI.

Use this if you need to generate privacy-preserving test data from an existing tabular dataset for development, testing, or sharing with others.

Not ideal if you need to generate synthetic data without any original data as a template, or if your data is not in a tabular format (e.g., images, audio).

data-privacy data-masking test-data-generation data-anonymization data-simulation

Maintenance 10 / 25

Adoption 6 / 25

Maturity 25 / 25

Community 17 / 25

Language: Jupyter Notebook

License: GPL-3.0

Related frameworks

meta-llama/synthetic-data-kit

Tool for generating high-quality synthetic datasets

Diyago/Tabular-data-generation

GANs are well known for their success in realistic image generation. However, they can be applied in...

Data-Centric-AI-Community/ydata-synthetic

Synthetic data generators for tabular and time-series data

vanderschaarlab/synthcity

A library for generating and evaluating synthetic tabular data for privacy, fairness and data...

always-further/deepfabric

Generate High-Quality Synthetics, Train, Measure, and Evaluate in a Single Pipeline
