hitsz-ids/synthetic-data-generator
SDG is a specialized framework designed to generate high-quality structured tabular data.
This tool helps data professionals create synthetic datasets that reproduce the statistical characteristics of their original structured data, such as customer records or transaction logs, without containing any sensitive information. You provide your existing tabular data, and it generates a new, privacy-compliant dataset that can be used for sharing, testing, or model training. It is designed for data scientists, analysts, and developers who need to work with data while adhering to privacy regulations.
Use this if you need to share, test, or train models with tabular data while protecting sensitive information and complying with privacy regulations like GDPR.
Not ideal if you need to generate unstructured data (like images or text documents) or if your primary goal is simple data anonymization without preserving statistical properties.
Stars: 2,409
Forks: 385
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 09, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/hitsz-ids/synthetic-data-generator"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
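The same request can be made from Python. The following is a minimal sketch using only the standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical, the reading of the three path segments as category, owner, and repository is inferred from the curl example above, and the response is assumed to be JSON, which this page does not state.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL in the same shape as the curl example.

    Assumption: the three path segments are category, owner, and repo.
    """
    # Percent-encode each segment so unusual characters cannot break the path.
    segments = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(segments)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record for one repository.

    Assumption: the endpoint returns a JSON body.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read().decode("utf-8")
    return json.loads(body)


# Example call (performs a network request and counts against the daily limit):
# data = fetch_quality("generative-ai", "hitsz-ids", "synthetic-data-generator")
# print(data)
```

Keeping URL construction separate from the network call makes it possible to check the URL without spending any of the daily request quota.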
Related tools
sdv-dev/SDV
Synthetic data generation for tabular data
sdv-dev/SDGym
Benchmarking synthetic data generation methods.
NVIDIA-NeMo/DataDesigner
🎨 NeMo Data Designer: A general library for generating high-quality synthetic data from scratch...
AlexanderVNikitin/tsgm
Generation and evaluation of synthetic time series datasets (also, augmentations,...
mostly-ai/mostlyai
Synthetic Data SDK ✨