SDGym and synthetic-data-generator
SDGym benchmarks synthetic data generation methods, while synthetic-data-generator (SDG) is a framework for generating structured tabular data. The two are complementary: SDG could be one of the methods that SDGym benchmarks.
About SDGym
sdv-dev/SDGym
Benchmarking synthetic data generation methods.
This tool helps data practitioners evaluate and compare different methods for creating synthetic datasets. You input various synthetic data generation models and your original datasets, and it outputs detailed reports on performance, memory usage, and the quality and privacy of the generated synthetic data. Data scientists and machine learning engineers who work with sensitive or limited real-world data would find this useful.
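To make the input and output of such a benchmark concrete, here is a minimal sketch using only the Python standard library. It is not SDGym's API: the two toy synthesizers, the `benchmark` function, and the `mean_gap` quality proxy are all hypothetical names chosen for illustration. It only shows the general shape of the workflow described above, where several generators and a real dataset go in and a per-generator report of time, peak memory, and a quality score comes out.

```python
import random
import statistics
import time
import tracemalloc


def resample_synthesizer(real, n, rng):
    """Toy synthesizer (hypothetical): draw values from the real column with replacement."""
    return [rng.choice(real) for _ in range(n)]


def gaussian_synthesizer(real, n, rng):
    """Toy synthesizer (hypothetical): fit a normal distribution and sample from it."""
    mu, sigma = statistics.mean(real), statistics.stdev(real)
    return [rng.gauss(mu, sigma) for _ in range(n)]


def benchmark(synthesizers, real, n, seed=0):
    """Run each synthesizer once; record wall time, peak memory, and a crude quality proxy."""
    rows = []
    for name, fn in synthesizers.items():
        rng = random.Random(seed)
        tracemalloc.start()
        start = time.perf_counter()
        synthetic = fn(real, n, rng)
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        # Assumed quality proxy: gap between real and synthetic means (lower is better).
        mean_gap = abs(statistics.mean(real) - statistics.mean(synthetic))
        rows.append({
            "synthesizer": name,
            "seconds": elapsed,
            "peak_bytes": peak,
            "mean_gap": mean_gap,
        })
    return rows


# Stand-in for a real numeric column; a real benchmark would load actual datasets.
data_rng = random.Random(42)
real_column = [data_rng.gauss(50.0, 10.0) for _ in range(1000)]

results = benchmark(
    {"resample": resample_synthesizer, "gaussian": gaussian_synthesizer},
    real_column,
    n=1000,
)
for row in results:
    print(row)
```

A real benchmark would add privacy metrics and richer quality measures over whole tables. This sketch covers a single numeric column to keep the idea visible.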
About synthetic-data-generator
hitsz-ids/synthetic-data-generator
SDG is a specialized framework designed to generate high-quality structured tabular data.
This tool helps data professionals create artificial datasets that mimic the real characteristics of their original structured data, like customer records or transaction logs, without containing any sensitive information. You provide your existing tabular data, and it generates a new, privacy-compliant dataset that can be used for various purposes. It's designed for data scientists, analysts, and developers who need to work with data while adhering to privacy regulations.