cahlen/conversation-dataset-generator
Craft conversational datasets (JSONL format with rich metadata) using LLMs. Specify parameters manually or use a creative brief for LLM-generated arguments with automatic topic/scenario variation. Optional web search improves persona grounding. Ideal for LoRA tuning, persona training, and creative writing. Includes Hugging Face Hub upload.
This tool helps content creators and AI trainers generate realistic, diverse conversational datasets. You provide descriptions for characters, topics, and scenarios, or a high-level creative brief, and it outputs structured conversations in JSONL format. This is ideal for anyone needing tailored dialogue examples to fine-tune large language models or develop rich character interactions.
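As a rough illustration of the JSONL output described above, the sketch below writes and reads one conversation turn per line. The field names (`conversation_id`, `turn`, `speaker`, `text`, `topic`) are hypothetical and chosen only for this example; the tool's actual schema is not shown on this page.

```python
import io
import json

# Hypothetical record layout: these field names are assumptions made for
# illustration, not the tool's documented schema.
records = [
    {"conversation_id": 0, "turn": 0, "speaker": "Ada", "text": "Shall we begin?", "topic": "chess"},
    {"conversation_id": 0, "turn": 1, "speaker": "Max", "text": "White moves first.", "topic": "chess"},
    {"conversation_id": 1, "turn": 0, "speaker": "Ada", "text": "Any plans today?", "topic": "weather"},
]


def write_jsonl(rows, fh):
    """Write one JSON object per line, which is what JSONL means."""
    for row in rows:
        fh.write(json.dumps(row, ensure_ascii=False) + "\n")


def read_jsonl(fh):
    """Parse each non-empty line as an independent JSON object."""
    return [json.loads(line) for line in fh if line.strip()]


# Round-trip through an in-memory buffer instead of a real file.
buffer = io.StringIO()
write_jsonl(records, buffer)
buffer.seek(0)
loaded = read_jsonl(buffer)
```

Because every line is a complete JSON object, a file in this shape can be streamed or appended to without re-parsing the whole dataset.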
No commits in the last 6 months.
Use this if you need to create synthetic conversations quickly for AI model training, character development, or generating varied dialogue for storytelling.
Not ideal if you need human-verified, real-world conversational data, as this tool generates synthetic content.
Stars: 12
Forks: 3
Language: Python
License: MIT
Category:
Last pushed: Apr 17, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/cahlen/conversation-dataset-generator"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
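The same request can be made from Python with only the standard library. This is a minimal sketch: the URL is taken from the curl command above, while the helper names `build_quality_url` and `fetch_quality` are hypothetical, and the assumptions that the response body is JSON and that other repositories follow the same path pattern are not confirmed by this page.

```python
import json
import urllib.request

# Base path copied from the curl example; whether the "transformers"
# segment applies to other repositories is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/transformers"


def build_quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for a given owner/repo pair."""
    return f"{BASE_URL}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record over HTTP.

    Assumption: the endpoint returns a JSON body. This function performs
    a network call and counts against the daily request limit.
    """
    with urllib.request.urlopen(build_quality_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


url = build_quality_url("cahlen", "conversation-dataset-generator")
```

Calling `fetch_quality("cahlen", "conversation-dataset-generator")` would issue the request; building the URL separately keeps that step checkable without network access.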
Higher-rated alternatives
PaddlePaddle/PaddleNLP
Easy-to-use and powerful LLM and SLM library with awesome model zoo.
meta-llama/llama-cookbook
Welcome to the Llama Cookbook! This is your go to guide for Building with Llama: Getting started...
arcee-ai/mergekit
Tools for merging pretrained large language models.
changyeyu/LLM-RL-Visualized
100+ original LLM / RL principle diagrams, presented by the author of "Large Model Algorithms" (100+ LLM/RL Algorithm Maps)
mindspore-lab/step_into_llm
MindSpore online courses: Step into LLM