ydata-synthetic and Synthetic-data-gen

These projects compete on overlapping functionality: both generate synthetic tabular data. ydata-synthetic has the broader scope, with dedicated time-series generators, and far more community adoption (1,614 stars and 257 forks versus 83 and 42).

ydata-synthetic (Established)

Maintenance 10/25, Adoption 10/25, Maturity 16/25, Community 23/25

Stars: 1,614
Forks: 257
Downloads: —
Commits (30d): 0
Language: Jupyter Notebook
License: MIT

No Package, No Dependents

Synthetic-data-gen (Emerging)

Maintenance 0/25, Adoption 9/25, Maturity 16/25, Community 21/25

Stars: 83
Forks: 42
Downloads: —
Commits (30d): 0
Language: Jupyter Notebook
License: MIT

Stale 6m, No Package, No Dependents

About ydata-synthetic

Data-Centric-AI-Community/ydata-synthetic

Synthetic data generators for tabular and time-series data

This project helps data professionals, researchers, and analysts create artificial datasets that statistically mimic real-world tabular or time-series data. You provide an existing dataset (such as customer demographics or stock prices), and it generates a new synthetic dataset with the same structure. It suits cases where the real data raises privacy concerns or is too small to be useful on its own.
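To make the "existing dataset in, synthetic dataset out" idea concrete, here is a minimal sketch in plain Python. It is not ydata-synthetic's API or method: as a simple stand-in, it fits a multivariate Gaussian to a small numeric table and samples new rows from it. The function names (`fit_gaussian`, `cholesky`, `sample_synthetic`) and the `real` table of age and income values are hypothetical, chosen only for illustration.

```python
import math
import random

def fit_gaussian(rows):
    """Estimate column means and the sample covariance matrix of a numeric table."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    cov = [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
            for j in range(d)] for i in range(d)]
    return means, cov

def cholesky(a):
    """Return lower-triangular L with L * L^T == a (a must be positive definite)."""
    d = len(a)
    low = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def sample_synthetic(rows, n_samples, seed=0):
    """Draw new rows that share the means and correlations of the input rows."""
    rng = random.Random(seed)
    means, cov = fit_gaussian(rows)
    low = cholesky(cov)
    d = len(means)
    out = []
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        out.append([means[i] + sum(low[i][k] * z[k] for k in range(i + 1))
                    for i in range(d)])
    return out

# Hypothetical "real" table of (age, income) rows, invented for this sketch.
real = [[25, 32000], [31, 41000], [38, 52000],
        [45, 61000], [52, 58000], [60, 70000]]
synthetic = sample_synthetic(real, n_samples=1000)
```

The synthetic rows keep the column structure and the strong age-income correlation of the input without repeating any original row. A Gaussian fit is far cruder than a learned generative model, so treat this only as an illustration of the workflow.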

Tags: data-privacy, dataset-augmentation, data-sharing, machine-learning-development, data-balancing

About Synthetic-data-gen

tirthajyoti/Synthetic-data-gen

Various methods for generating synthetic data for data science and ML

This project helps data scientists and machine learning practitioners generate diverse datasets for training and testing algorithms. It takes your specifications for the data (number of samples, number of features, statistical distributions, and desired complexity) and outputs synthetic datasets tailored to classification, regression, clustering, or time-series problems. It suits those learning new algorithms or exploring how an algorithm behaves under specific, controlled data conditions.
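The specification-driven approach can be sketched in a few lines of plain Python. This is not code from the Synthetic-data-gen repository: `make_linear_dataset` and its parameters are hypothetical names used here to show how sample count, feature count, and a noise level (one simple notion of "complexity") can define a regression dataset.

```python
import random

def make_linear_dataset(n_samples, n_features, noise=1.0, seed=0):
    """Generate a regression dataset y = coef . x + Gaussian noise.

    Features are drawn from a standard normal distribution and the true
    coefficients are drawn uniformly from [-5, 5]. All of these choices
    are assumptions made for this sketch.
    """
    rng = random.Random(seed)
    coef = [rng.uniform(-5.0, 5.0) for _ in range(n_features)]
    X, y = [], []
    for _ in range(n_samples):
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        target = sum(c * x for c, x in zip(coef, row)) + rng.gauss(0.0, noise)
        X.append(row)
        y.append(target)
    return X, y, coef

# 200 samples, 3 features, moderate noise.
X, y, coef = make_linear_dataset(n_samples=200, n_features=3, noise=0.5, seed=42)
```

Because the true coefficients are returned with the data, a learner can check how closely a fitted model recovers them as `noise` increases, which is the kind of controlled experiment the description refers to.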

Tags: machine-learning-education, algorithm-testing, data-simulation, model-training, statistical-modeling


Scores updated daily from GitHub, PyPI, and npm data.