FunnySaltyFish/bilibili_comments_crawl

Build a dialogue dataset for training large language models from Bilibili comment-section data

Score: 24 / 100 (Experimental)

This project helps content creators, social media analysts, or market researchers understand audience engagement on Bilibili videos. It takes a Bilibili video's ID and your login credentials to collect all associated comments and replies. The output is a structured conversation dataset, revealing how users interact and form discussion threads. This is for anyone interested in deep-diving into natural, multi-turn conversations from online video communities.
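As a rough illustration of what "structured conversation dataset" could mean here, the sketch below turns a flat list of comments and replies into multi-turn dialogue chains. It is not the repository's code: the record fields `id`, `parent_id`, `user`, and `text`, and the function name `build_conversations`, are assumptions made for this example, and the project's real output format may differ.

```python
from collections import defaultdict


def build_conversations(comments):
    """Turn flat comment records into root-to-leaf dialogue chains.

    Each record is assumed (hypothetically) to carry an "id", a
    "parent_id" (None for a top-level comment), and a "text" field.
    """
    by_id = {c["id"]: c for c in comments}

    # Map each parent id to the ids of its direct replies, in input order.
    children = defaultdict(list)
    for c in comments:
        children[c["parent_id"]].append(c["id"])

    conversations = []

    def walk(comment_id, path):
        path = path + [by_id[comment_id]["text"]]
        replies = children.get(comment_id, [])
        if not replies:
            # Keep only chains with at least one reply (two or more turns).
            if len(path) > 1:
                conversations.append(path)
            return
        for reply_id in replies:
            walk(reply_id, path)

    for root_id in children[None]:
        walk(root_id, [])
    return conversations


# Made-up sample data, not taken from any real video.
sample = [
    {"id": 1, "parent_id": None, "user": "A", "text": "Is this worth watching?"},
    {"id": 2, "parent_id": 1, "user": "B", "text": "Yes, the second half is great."},
    {"id": 3, "parent_id": 2, "user": "A", "text": "Thanks, I'll keep going."},
    {"id": 4, "parent_id": 1, "user": "C", "text": "I dropped it halfway."},
    {"id": 5, "parent_id": None, "user": "D", "text": "First!"},
]

result = build_conversations(sample)
for turns in result:
    print(" -> ".join(turns))
```

Each branch of a reply tree becomes its own conversation, so a comment with two separate replies yields two chains that share the opening turn. Top-level comments with no replies are dropped, since they carry no dialogue.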

No commits in the last 6 months.

Use this if you need to gather authentic, multi-turn Chinese dialogue data from Bilibili video comment sections for qualitative analysis or to train conversational AI models.

Not ideal if you need a general-purpose web scraper for various websites, or if you require data beyond Bilibili comment threads.

social-media-research audience-analysis conversational-data chinese-web-content bilibili-analytics
No License | Stale 6m | No Package | No Dependents
Maintenance 0 / 25
Adoption 8 / 25
Maturity 8 / 25
Community 8 / 25


Stars: 59
Forks: 4
Language: Python
License: none listed
Last pushed: Dec 16, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/FunnySaltyFish/bilibili_comments_crawl"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
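The same request can be made from Python with only the standard library. This is a minimal sketch, assuming the endpoint returns JSON (the listing does not document the response schema, so no fields are read here). The helper names `build_quality_url` and `fetch_quality` are hypothetical, and treating `llm-tools` as a category path segment is inferred from the URL above.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(owner, repo, category="llm-tools"):
    """Build the endpoint URL; the path layout is copied from the curl example."""
    parts = [quote(p, safe="") for p in (category, owner, repo)]
    return f"{BASE_URL}/" + "/".join(parts)


def fetch_quality(owner, repo, category="llm-tools", timeout=10):
    """Fetch the record and parse it as JSON (assumed response format)."""
    with urlopen(build_quality_url(owner, repo, category), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


url = build_quality_url("FunnySaltyFish", "bilibili_comments_crawl")
print(url)

# Uncomment to perform the network request (counts against the daily limit):
# print(fetch_quality("FunnySaltyFish", "bilibili_comments_crawl"))
```

The network call is left commented out so the snippet can be run without spending any of the 100 keyless requests per day.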