sdv-dev/SDGym
Benchmarking synthetic data generation methods.
This tool helps data practitioners evaluate and compare methods for creating synthetic datasets. You supply synthetic data generation models and your original datasets, and it produces detailed reports on performance, memory usage, and the quality and privacy of the generated synthetic data. It is aimed at data scientists and machine learning engineers who work with sensitive or limited real-world data.
301 stars. Used by 1 other package. Available on PyPI.
Use this if you need to reliably choose the best synthetic data generation technique for your specific data and use case by objectively benchmarking different models.
Not ideal if you are looking for a simple 'one-click' solution to generate synthetic data without needing to compare or customize underlying models.
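To make the benchmarking idea described above concrete, here is a minimal, standard-library sketch of the general pattern: run several generators over the same datasets and record time, peak memory, a quality score, and a privacy proxy per run. It does not use SDGym's API. The two toy synthesizers, `quality_score`, `novel_fraction`, and `benchmark` are all hypothetical names invented for illustration, and the metrics are deliberately crude stand-ins for whatever a real benchmarking tool reports.

```python
import random
import statistics
import time
import tracemalloc


def gaussian_synthesizer(real, n, rng):
    """Toy model (assumption): sample from a normal fitted to the real column."""
    mu, sigma = statistics.mean(real), statistics.pstdev(real)
    return [rng.gauss(mu, sigma) for _ in range(n)]


def bootstrap_synthesizer(real, n, rng):
    """Toy model (assumption): resample real values with replacement."""
    return [rng.choice(real) for _ in range(n)]


def quality_score(real, synthetic):
    """Hypothetical score in (0, 1]; 1.0 means the two means match exactly."""
    scale = statistics.pstdev(real) or 1.0
    gap = abs(statistics.mean(real) - statistics.mean(synthetic)) / scale
    return 1.0 / (1.0 + gap)


def novel_fraction(real, synthetic):
    """Crude privacy proxy: share of synthetic values that copy no real value."""
    seen = set(real)
    return sum(value not in seen for value in synthetic) / len(synthetic)


def benchmark(synthesizers, datasets, seed=0):
    """Run every synthesizer on every dataset; return one result row per pair."""
    rows = []
    for dataset_name, real in datasets.items():
        for synth_name, synthesize in synthesizers.items():
            rng = random.Random(seed)  # same seed per run for repeatability
            tracemalloc.start()
            start = time.perf_counter()
            synthetic = synthesize(real, len(real), rng)
            elapsed = time.perf_counter() - start
            _, peak = tracemalloc.get_traced_memory()
            tracemalloc.stop()
            rows.append({
                "dataset": dataset_name,
                "synthesizer": synth_name,
                "seconds": elapsed,
                "peak_memory_bytes": peak,
                "quality": quality_score(real, synthetic),
                "novel_fraction": novel_fraction(real, synthetic),
            })
    return rows


if __name__ == "__main__":
    toy_datasets = {"toy": [1.0, 2.0, 3.0, 4.0, 5.0]}
    toy_synthesizers = {
        "gaussian": gaussian_synthesizer,
        "bootstrap": bootstrap_synthesizer,
    }
    for row in benchmark(toy_synthesizers, toy_datasets):
        print(row)
```

The bootstrap generator only ever emits copies of real values, so its `novel_fraction` is zero. That is the kind of quality-versus-privacy trade-off a side-by-side report is meant to surface.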
Stars: 301
Forks: 67
Language: Python
License: not listed
Category: not listed
Last pushed: Mar 13, 2026
Commits (30d): 0
Dependencies: 21
Reverse dependents: 1
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/sdv-dev/SDGym"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
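The same request can be sketched in Python with only the standard library. The split of the URL path into a category, an owner, and a repository is an assumption inferred from the single example URL above, and the helper names `build_quality_url` and `fetch_quality` are hypothetical. The response format is not documented here, so the sketch returns the raw body as text instead of parsing it.

```python
import urllib.request

# Base path taken from the curl example; everything after it is assumed to be
# category/owner/repo, which is inferred from that one URL only.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL for one repository (assumed path layout)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10.0):
    """Send an unauthenticated GET request and return the raw response text."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_quality("generative-ai", "sdv-dev", "SDGym")` issues the same request as the curl command and counts against the daily limit. How an API key would be passed is not stated here, so the sketch covers the keyless case only.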
Related tools
sdv-dev/SDV
Synthetic data generation for tabular data
NVIDIA-NeMo/DataDesigner
🎨 NeMo Data Designer: A general library for generating high-quality synthetic data from scratch...
AlexanderVNikitin/tsgm
Generation and evaluation of synthetic time series datasets (also, augmentations,...
mostly-ai/mostlyai
Synthetic Data SDK ✨
hitsz-ids/synthetic-data-generator
SDG is a specialized framework designed to generate high-quality structured tabular data.