sigvt/vtuber-livechat-dataset

📊 VTuber 1B: Billion-scale Live Chat and Moderation Event Dataset

33 / 100 (Emerging)

This project provides a massive collection of live chat messages, 'Super Chats' (paid messages), and moderation events (like bans and deletions) from virtual YouTubers' streams. It's designed for researchers, social scientists, or anyone studying online communities, trends, and content moderation. You can analyze raw chat data or pre-calculated statistics to understand viewer engagement, identify common spam/toxic phrases, or visualize demographic patterns.

No commits in the last 6 months.

Use this if you need a large-scale dataset of real-world live stream interactions to study audience behavior, language use, or the effectiveness of moderation in online entertainment communities.

Not ideal if you're looking for a small, curated dataset for quick qualitative analysis or if your research focus is outside of virtual streamer communities.

online-community-research social-media-analysis content-moderation virtual-youtubers audience-engagement
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 9 / 25
Maturity 16 / 25
Community 8 / 25


Stars 92
Forks 5
Language Python
License MIT
Last pushed Aug 04, 2022
Commits (30d) 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/sigvt/vtuber-livechat-dataset"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
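The same request can be made from Python. The following is a minimal sketch using only the standard library. It assumes the endpoint returns a JSON body and that the three path segments are a category, a repository owner, and a repository name; neither assumption is confirmed by this page, and the helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL from its three path segments.

    Assumption: the segments mean category / owner / repo, inferred
    from the shape of the curl URL rather than from API documentation.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return "/".join([BASE_URL, *parts])


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response.

    Assumption: the body is JSON. Its schema is not described on this
    page, so the decoded object is returned without further parsing.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("ml-frameworks", "sigvt", "vtuber-livechat-dataset")
# print(json.dumps(data, indent=2))
```

Since keyless access is limited to 100 requests/day, it is worth caching the decoded result locally instead of refetching it on every run.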