vanderschaarlab/synthcity
A library for generating and evaluating synthetic tabular data for privacy, fairness and data augmentation.
This tool helps data professionals create artificial datasets that look and behave like real-world data without reproducing the original sensitive records. You provide your original tabular data, and it produces a new synthetic dataset that can be shared or used for development with reduced privacy risk. This is ideal for data scientists, analysts, and researchers working with confidential information.
643 stars.
Use this if you need to generate high-quality synthetic versions of sensitive tabular, time-series, survival analysis, or image data for privacy-preserving analysis, sharing, or model development.
Not ideal if your original data contains missing values, as this tool requires data to be fully imputed beforehand.
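Because the data must be complete before it is handed to the tool, it can help to check for gaps first. The following is a minimal, dependency-free sketch of such a pre-check; it is not part of synthcity. The helper name `count_missing`, the `MISSING_TOKENS` set, and the sample CSV are all assumptions made for illustration.

```python
import csv
import io

# Tokens treated as "missing". This set is an assumption; adjust it for your data.
MISSING_TOKENS = {"", "na", "n/a", "nan", "null", "none"}


def count_missing(csv_text: str) -> dict[str, int]:
    """Return the number of missing cells per column of a CSV string."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = {name: 0 for name in (reader.fieldnames or [])}
    for row in reader:
        for name in counts:
            value = row.get(name)
            # DictReader yields None for cells absent from a short row.
            if value is None or value.strip().lower() in MISSING_TOKENS:
                counts[name] += 1
    return counts


# Hypothetical sample: one empty "age" cell and one "NaN" in "income".
sample = "age,income,label\n34,52000,1\n,61000,0\n45,NaN,1\n"
missing = count_missing(sample)

# Columns that would need imputation before generating synthetic data.
incomplete = {col: n for col, n in missing.items() if n > 0}
print(incomplete)  # {'age': 1, 'income': 1}
```

Any column reported here would need to be imputed (or its rows dropped) before the data is passed on; which strategy is appropriate depends on the dataset.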
Stars: 643
Forks: 90
Language: Python
License: Apache-2.0
Category:
Last pushed: Feb 11, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/vanderschaarlab/synthcity"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
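The same request can be sketched in Python using only the standard library. Two things here are assumptions rather than documented behavior: that the URL follows an owner/repo pattern beyond the single example shown, and that the endpoint returns JSON. The function names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; the "<owner>/<repo>" suffix
# pattern is an assumption generalized from that single URL.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks"


def quality_url(owner: str, repo: str) -> str:
    """Build the request URL for one repository."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response, assuming (unverified) a JSON body."""
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


url = quality_url("vanderschaarlab", "synthcity")
print(url)

# Network call left commented out so the sketch runs offline:
# data = fetch_quality("vanderschaarlab", "synthcity")
```

With the keyless limit of 100 requests/day, it is worth caching the response locally rather than calling `fetch_quality` on every run.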
Related frameworks
Diyago/Tabular-data-generation
GANs are well known for their success in realistic image generation. However, they can be applied in...
meta-llama/synthetic-data-kit
Tool for generating high-quality synthetic datasets
Data-Centric-AI-Community/ydata-synthetic
Synthetic data generators for tabular and time-series data
tdspora/syngen
Open-source version of the TDspora synthetic data generation algorithm.
always-further/deepfabric
Generate High-Quality Synthetics, Train, Measure, and Evaluate in a Single Pipeline