cahlen/conversation-dataset-generator

Craft conversational datasets (JSONL format with rich metadata) using LLMs. Specify parameters manually or use a creative brief for LLM-generated arguments with automatic topic/scenario variation. Optional web search improves persona grounding. Ideal for LoRA tuning, persona training, and creative writing. Includes Hugging Face Hub upload.

Score: 37 / 100 (Emerging)

This tool helps content creators and AI trainers generate realistic, diverse conversational datasets. You provide descriptions for characters, topics, and scenarios, or a high-level creative brief, and it outputs structured conversations in JSONL format. This is ideal for anyone needing tailored dialogue examples to fine-tune large language models or develop rich character interactions.
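
For readers unfamiliar with JSONL: each line of such a file is one standalone JSON object, so a dataset can be read line by line. The sketch below shows one way to load it in Python. The field names (`speaker`, `text`, `topic`) and the helper `read_jsonl` are hypothetical placeholders for illustration, not the tool's confirmed schema or API.

```python
import io
import json

def read_jsonl(stream):
    """Yield one parsed JSON object per non-empty line of a JSONL stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)

# Hypothetical sample data; the real field names produced by the tool may differ.
sample = io.StringIO(
    '{"speaker": "A", "text": "Hello!", "topic": "greetings"}\n'
    '{"speaker": "B", "text": "Hi there.", "topic": "greetings"}\n'
)

records = list(read_jsonl(sample))
print(len(records))
```

To read a real file, pass an open file handle instead of the in-memory sample, for example `open("dataset.jsonl", encoding="utf-8")`, where the filename is also an assumption.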

No commits in the last 6 months.

Use this if you need to create synthetic conversations quickly for AI model training, character development, or generating varied dialogue for storytelling.

Not ideal if you need human-verified, real-world conversational data, as this tool generates synthetic content.

Tags: AI model training, content generation, dialogue authoring, character development, data synthesis
Stale 6m, No Package, No Dependents
Maintenance 2 / 25
Adoption 5 / 25
Maturity 16 / 25
Community 14 / 25
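
The four category scores above add up to the overall figure: 2 + 5 + 16 + 14 = 37, out of 4 × 25 = 100. The short sketch below checks that arithmetic. That the total is a plain sum of the categories is inferred from the numbers on this page, not from a documented formula.

```python
# Category scores as listed on this page, each out of 25.
categories = {"Maintenance": 2, "Adoption": 5, "Maturity": 16, "Community": 14}
MAX_PER_CATEGORY = 25

total = sum(categories.values())
maximum = MAX_PER_CATEGORY * len(categories)

print(f"{total} / {maximum}")  # prints "37 / 100"
```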


Stars: 12
Forks: 3
Language: Python
License: MIT
Last pushed: Apr 17, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/cahlen/conversation-dataset-generator"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
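
The same request can be made from Python using only the standard library. This is a minimal sketch under two assumptions: that the URL pattern generalizes from the single curl example above (owner and repository as the last two path segments), and that the response body is JSON, which this page does not confirm. The helper names `quality_url` and `fetch_quality` are illustrative.

```python
import json
import urllib.request

# Base path copied from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/transformers"

def quality_url(owner, repo):
    """Build the endpoint URL for a repository, following the curl example."""
    return f"{BASE_URL}/{owner}/{repo}"

def fetch_quality(owner, repo, timeout=10):
    """Fetch and decode the response; assumes the body is JSON (not confirmed)."""
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))

url = quality_url("cahlen", "conversation-dataset-generator")
print(url)
```

Calling `fetch_quality("cahlen", "conversation-dataset-generator")` performs a real network request and counts against the daily request limit, so the sketch only builds and prints the URL by default.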