SDGym and synthetic-data-generator

SDGym benchmarks synthetic data generation methods, while synthetic-data-generator (SDG) is a specialized framework for generating high-quality structured tabular data. The two are complementary: SDG could be one of the methods that SDGym benchmarks.

SDGym
Score: 69 (Established)
Maintenance: 10/25
Adoption: 11/25
Maturity: 25/25
Community: 23/25
Stars: 301
Forks: 67
Downloads: not listed
Commits (30d): 0
Language: Python
License: not listed
Risk flags: none

synthetic-data-generator
Score: 59 (Established)
Maintenance: 10/25
Adoption: 10/25
Maturity: 16/25
Community: 23/25
Stars: 2,409
Forks: 385
Downloads: not listed
Commits (30d): 0
Language: Python
License: Apache-2.0
Risk flags: No Package, No Dependents

About SDGym

sdv-dev/SDGym

Benchmarking synthetic data generation methods.

This tool helps data practitioners evaluate and compare methods for creating synthetic datasets. You supply one or more synthetic data generation models along with your original datasets, and it produces detailed reports on performance, memory usage, and the quality and privacy of the generated synthetic data. It is aimed at data scientists and machine learning engineers who work with sensitive or limited real-world data.

data-science machine-learning-engineering data-privacy data-anonymization synthetic-data-generation
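
The workflow described above (models and original data in, a report on speed, memory, and quality out) can be illustrated with a small standard-library sketch. This is not SDGym's API: the two samplers, the `mean_gap` score, the `benchmark` function, and the age/income columns are all hypothetical stand-ins chosen for illustration, and the privacy metrics mentioned above are left out.

```python
import random
import statistics
import time
import tracemalloc


def independent_sampler(rows, n, rng):
    """Hypothetical baseline: resample each column independently from the real values."""
    columns = list(zip(*rows))
    return [tuple(rng.choice(col) for col in columns) for _ in range(n)]


def gaussian_sampler(rows, n, rng):
    """Hypothetical baseline: fit an independent normal distribution to each column."""
    columns = list(zip(*rows))
    params = [(statistics.mean(col), statistics.pstdev(col)) for col in columns]
    return [tuple(rng.gauss(mu, sigma) for mu, sigma in params) for _ in range(n)]


def mean_gap(real, synthetic):
    """Toy quality score: mean absolute difference of column means (lower is better)."""
    real_cols, synth_cols = list(zip(*real)), list(zip(*synthetic))
    gaps = [
        abs(statistics.mean(r) - statistics.mean(s))
        for r, s in zip(real_cols, synth_cols)
    ]
    return sum(gaps) / len(gaps)


def benchmark(synthesizers, real, n_rows, seed=0):
    """Run each synthesizer on the same data and record time, peak memory, and quality."""
    report = []
    for name, synthesize in synthesizers.items():
        rng = random.Random(seed)  # same seed for every method, for repeatability
        tracemalloc.start()
        start = time.perf_counter()
        synthetic = synthesize(real, n_rows, rng)
        seconds = time.perf_counter() - start
        _, peak_bytes = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        report.append({
            "synthesizer": name,
            "rows": len(synthetic),
            "seconds": seconds,
            "peak_bytes": peak_bytes,
            "mean_gap": mean_gap(real, synthetic),
        })
    return report


# Made-up "original" dataset with two numeric columns (age, income).
data_rng = random.Random(42)
real_data = [(data_rng.gauss(40, 10), data_rng.gauss(50_000, 8_000)) for _ in range(500)]

report = benchmark(
    {"independent": independent_sampler, "gaussian": gaussian_sampler},
    real_data,
    n_rows=200,
)
for row in report:
    print(row)
```

Each entry in `report` corresponds to one method, so competing generators can be compared side by side on the same dataset. A real benchmark would use richer quality and privacy metrics than a difference of column means.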

About synthetic-data-generator

hitsz-ids/synthetic-data-generator

SDG is a specialized framework designed to generate high-quality structured tabular data.

This tool helps data professionals create artificial datasets that mimic the characteristics of their original structured data, such as customer records or transaction logs, without containing any sensitive information. You provide your existing tabular data, and it generates a new, privacy-compliant dataset that can be used for purposes such as data sharing, model training, and testing. It is designed for data scientists, analysts, and developers who need to work with data while adhering to privacy regulations.

data-privacy data-sharing model-training data-testing tabular-data-analysis

Scores updated daily from GitHub, PyPI, and npm data.