RWKV-Wiki/MultilingualShareGPT

MultilingualShareGPT, the free multi-language corpus for LLM training

33 / 100 (Emerging)

This project provides a large collection of real-world conversations between humans and AI assistants, sourced from various online platforms and presented in a structured markdown format. The exchanges span many languages. The result is a free dataset, released under CC0-1.0, for anyone training or improving AI models that understand and generate text in multiple languages.

No commits in the last 6 months.

Use this if you are a researcher or AI model developer in need of a diverse, multilingual conversation dataset to train large language models.

Not ideal if you need a dataset focused on technical code or on specialized, niche domain knowledge.

AI-training-data natural-language-processing multilingual-AI conversation-AI machine-learning-datasets
Stale 6m · No Package · No Dependents
Maintenance 0 / 25
Adoption 9 / 25
Maturity 16 / 25
Community 8 / 25

Stars 73
Forks 4
Language (none listed)
License CC0-1.0
Last pushed Apr 06, 2023
Commits (30d) 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/RWKV-Wiki/MultilingualShareGPT"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
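
The same request can be made from a script. The following is a minimal Python sketch using only the standard library. It assumes the URL follows the pattern category/owner/repository seen in the curl example; the helper names `quality_url` and `fetch_quality` are hypothetical. The response format is not documented here, so the sketch returns parsed JSON when the body is valid JSON and the raw text otherwise. How an API key is supplied is also not stated on this page, so the sketch uses the keyless tier only.

```python
import json
import urllib.request

# Base path copied from the curl example; everything after it is
# category/owner/repo (assumed pattern, inferred from that one URL).
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(owner: str, repo: str, category: str = "llm-tools") -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str, category: str = "llm-tools",
                  timeout: float = 10.0):
    """Fetch the record for one repository.

    Returns parsed JSON if the body is valid JSON, otherwise the raw text,
    because the response format is an assumption.
    """
    url = quality_url(owner, repo, category)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = resp.read().decode("utf-8")
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        return body


# Example call (performs a network request, counts against the daily limit):
# print(fetch_quality("RWKV-Wiki", "MultilingualShareGPT"))
```

Keeping the URL construction in its own function makes it possible to check the request target without spending any of the 100 keyless requests per day.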