tdspora/syngen
Open-source version of the TDspora synthetic data generation algorithm.
This tool helps you create realistic synthetic datasets from your existing tabular data, such as CSV or Excel files, without revealing sensitive information. It takes your original dataset as input and generates a new, synthetic dataset that mimics the statistical properties of the real data. It is useful for data scientists, analysts, and anyone who needs test data for development, training, or sharing.
Available on PyPI.
Use this if you need to generate privacy-preserving test data from an existing tabular dataset for development, testing, or sharing with others.
Not ideal if you need to generate synthetic data without any original data as a template, or if your data is not in a tabular format (e.g., images, audio).
Stars: 18
Forks: 11
Language: Jupyter Notebook
License: GPL-3.0
Category:
Last pushed: Mar 13, 2026
Commits (30d): 0
Dependencies: 38
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/tdspora/syngen"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
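For scripted use, the same request can be made from Python. The following is a minimal sketch using only the standard library, not an official client. It rests on two assumptions that this page does not confirm: that the URL path follows a category/owner/repo pattern (inferred from the single curl example), and that the response body is JSON. The names `quality_url` and `fetch_quality` are hypothetical helpers introduced for illustration.

```python
import json
import urllib.request

# Base of the endpoint shown in the curl example.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL.

    Assumption: the path is <category>/<owner>/<repo>, as suggested by
    the one curl example on this page.
    """
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Request the record without an API key and decode the body.

    Assumption: the server replies with JSON. If it does not,
    json.loads raises a ValueError.
    """
    request = urllib.request.Request(
        quality_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("ml-frameworks", "tdspora", "syngen")` issues the same request as the curl command. Because keyless access is limited to 100 requests/day, it is worth caching the result rather than calling the endpoint in a loop.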
Related frameworks
meta-llama/synthetic-data-kit
Tool for generating high-quality synthetic datasets
Diyago/Tabular-data-generation
GANs are well known for their success in realistic image generation. However, they can be applied in...
Data-Centric-AI-Community/ydata-synthetic
Synthetic data generators for tabular and time-series data
vanderschaarlab/synthcity
A library for generating and evaluating synthetic tabular data for privacy, fairness and data...
always-further/deepfabric
Generate High-Quality Synthetics, Train, Measure, and Evaluate in a Single Pipeline