ydata-synthetic and Synthetic-data-gen
These are competitors with overlapping functionality: both generate synthetic tabular data. ydata-synthetic learns from an existing dataset, lists time-series generation as a core feature, and has more community adoption, while Synthetic-data-gen builds datasets from user-supplied specifications.
About ydata-synthetic
Data-Centric-AI-Community/ydata-synthetic
Synthetic data generators for tabular and time-series data
This project helps data professionals, researchers, and analysts create artificial datasets that statistically mimic real-world tabular or time-series data. You provide an existing dataset (such as customer demographics or stock prices), and it generates a new, synthetic dataset of the same type and structure. This suits anyone working with data that is privacy-sensitive or too small to be useful on its own.
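The "existing dataset in, synthetic dataset out" workflow can be illustrated with a deliberately naive sketch. It does not use ydata-synthetic or reflect its API or models. The helper names (fit_column_stats, sample_synthetic) and the sample rows are hypothetical, and each column is assumed to be numeric, roughly Gaussian, and independent of the others.

```python
import random
import statistics

def fit_column_stats(rows):
    """Estimate the mean and sample standard deviation of each numeric column."""
    columns = list(zip(*rows))
    return [(statistics.mean(col), statistics.stdev(col)) for col in columns]

def sample_synthetic(stats, n_rows, seed=0):
    """Draw new rows column by column from the fitted Gaussians.

    Assumption: columns are independent, so correlations present in the
    real data are NOT preserved by this toy approach.
    """
    rng = random.Random(seed)
    return [[rng.gauss(mu, sigma) for mu, sigma in stats] for _ in range(n_rows)]

# Hypothetical "real" dataset: (age, income) pairs.
real_rows = [
    [34, 52000.0],
    [45, 61000.0],
    [29, 48000.0],
    [52, 75000.0],
    [38, 58000.0],
]

stats = fit_column_stats(real_rows)
synthetic_rows = sample_synthetic(stats, n_rows=100)
```

The output has the same column structure as the input but contains no original record. Because this sketch samples each column separately, it loses the relationships between columns that a dedicated generator is meant to capture.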
About Synthetic-data-gen
tirthajyoti/Synthetic-data-gen
Various methods for generating synthetic data for data science and ML
This project helps data scientists and machine learning practitioners generate diverse datasets for training and testing algorithms. It takes your specifications for data characteristics—like the number of samples, features, statistical distributions, and desired complexity—and outputs synthetic datasets tailored for classification, regression, clustering, or time series problems. This is ideal for those learning new algorithms or needing to explore algorithm behavior under specific, controlled data conditions.
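As a rough illustration of specification-driven generation, the sketch below builds a labeled classification dataset from a sample count, a feature count, a class count, and a spread parameter. It uses only the Python standard library and is not code from Synthetic-data-gen. The function name make_blobs_dataset and all of its parameters are hypothetical.

```python
import random

def make_blobs_dataset(n_samples, n_features, n_classes, spread=1.0, seed=0):
    """Generate Gaussian clusters, one per class, around random centers.

    A larger spread makes the classes overlap more, which is one simple
    way to control how hard the resulting problem is.
    """
    rng = random.Random(seed)
    # One random center per class in the feature space.
    centers = [
        [rng.uniform(-10, 10) for _ in range(n_features)]
        for _ in range(n_classes)
    ]
    X, y = [], []
    for i in range(n_samples):
        label = i % n_classes  # cycle through classes for a balanced dataset
        X.append([rng.gauss(c, spread) for c in centers[label]])
        y.append(label)
    return X, y

# 200 samples, 3 features, 4 tightly clustered classes.
X, y = make_blobs_dataset(n_samples=200, n_features=3, n_classes=4, spread=0.5)
```

Fixing the seed makes the dataset reproducible, which helps when comparing how different algorithms behave under identical, controlled conditions.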