convei-lab/BotsTalk
🤖 Code for our EMNLP 2022 paper: "BotsTalk: Machine-sourced Framework for Automatic Curation of Large-scale Multi-skill Dialogue Datasets"
BotsTalk helps conversational AI researchers and developers automatically create large-scale, multi-skill dialogue datasets. It takes existing dialogue datasets, each covering a different conversational skill, as input and generates new, complex dialogues that combine these skills. The resulting data is useful for training more versatile, human-like chatbots.
No commits in the last 6 months.
Use this if you need to quickly generate extensive and diverse dialogue datasets for training advanced conversational AI models.
Not ideal if you are looking for a tool to manually annotate or curate small, specialized dialogue datasets.
Stars: 16
Forks: 1
Language: Python
License: MIT
Category:
Last pushed: Oct 07, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/convei-lab/BotsTalk"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
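The same request can be made from Python. The sketch below uses only the standard library and mirrors the curl example above. The helper names `build_url` and `fetch_quality` are hypothetical. Two things are assumptions rather than documented behavior: that other repositories follow the same `owner/repo` URL pattern, and that the response body is JSON (the raw text is returned if it is not).

```python
import json
import urllib.request

# Base URL copied from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality/nlp"


def build_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository.

    Assumption: other repositories use the same owner/repo path pattern
    as the single example shown on this page.
    """
    return f"{API_BASE}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository (hypothetical helper).

    Assumption: the body is JSON. If parsing fails, the raw text is
    returned so the caller can inspect it.
    """
    with urllib.request.urlopen(build_url(owner, repo), timeout=timeout) as resp:
        body = resp.read().decode("utf-8")
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        return body


# Example call (performs a network request, so it is left commented out):
# print(fetch_quality("convei-lab", "BotsTalk"))
```

Each call counts against the daily request limit stated above, so it may be worth caching the result locally when polling several repositories.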
Higher-rated alternatives
gunthercox/chatterbot-corpus
A multilingual dialog corpus
EdinburghNLP/awesome-hallucination-detection
List of papers on hallucination detection in LLMs.
jfainberg/self_dialogue_corpus
The Self-dialogue Corpus - a collection of self-dialogues across music, movies and sports
jkkummerfeld/irc-disentanglement
Dataset and model for disentangling chat on IRC
Tomiinek/MultiWOZ_Evaluation
Unified MultiWOZ evaluation scripts for the context-to-response task.