Shekswess/synthgenai
SynthGenAI - Package for Generating Synthetic Datasets using LLMs.
SynthGenAI helps developers create diverse and useful datasets for training AI models. It takes a topic and domain, then uses various Large Language Models (LLMs) to generate synthetic data, such as instructions, that can be used to improve model performance. This tool is for AI/ML developers who need to quickly generate high-quality datasets when real-world data is scarce, expensive, or restricted.
Use this if you are an AI/ML developer who needs to generate synthetic training data quickly for a specific topic or domain using large language models.
Not ideal if you are a non-technical user looking for an out-of-the-box solution for data generation without writing code.
Stars: 54
Forks: 4
Language: Python
License: MIT
Category:
Last pushed: Nov 24, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/Shekswess/synthgenai"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
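The same request can be made from Python. The following is a minimal sketch using only the standard library. It assumes the endpoint path is laid out as category/owner/repo (inferred from the URL above) and that the response body is JSON; neither is confirmed by this page, and the function names `build_quality_url` and `fetch_quality` are hypothetical helpers, not part of any published client.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; the segment layout is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL from its three path segments (hypothetical helper)."""
    # Percent-encode each segment so unusual characters cannot break the path.
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record for a repository without an API key.

    Assumes the server returns a JSON document; the schema is not described
    on this page, so the parsed value is returned unchanged.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("generative-ai", "Shekswess", "synthgenai")` would issue the same unauthenticated request as the curl command and counts against the 100 requests/day limit.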
Higher-rated alternatives
sdv-dev/SDV
Synthetic data generation for tabular data
sdv-dev/SDGym
Benchmarking synthetic data generation methods.
NVIDIA-NeMo/DataDesigner
🎨 NeMo Data Designer: A general library for generating high-quality synthetic data from scratch...
AlexanderVNikitin/tsgm
Generation and evaluation of synthetic time series datasets (also, augmentations,...
mostly-ai/mostlyai
Synthetic Data SDK ✨