FunnySaltyFish/bilibili_comments_crawl

Builds a conversational dataset for training large language models from Bilibili comment-section data


Experimental

This project helps content creators, social media analysts, or market researchers understand audience engagement on Bilibili videos. It takes a Bilibili video's ID and your login credentials to collect all associated comments and replies. The output is a structured conversation dataset, revealing how users interact and form discussion threads. This is for anyone interested in deep-diving into natural, multi-turn conversations from online video communities.
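The step from collected comments and replies to a structured conversation dataset can be sketched as follows. This is a minimal illustration, not the project's actual code: the record fields `id`, `parent`, `user`, and `text` and the function `build_threads` are hypothetical names chosen for the example, and no network requests are made. It assumes each reply records the id of the comment it answers.

```python
from collections import defaultdict


def build_threads(comments):
    """Turn flat comment records into root-to-leaf conversation paths.

    Each record is a dict with hypothetical keys: id, parent (None for a
    top-level comment), user, and text.
    """
    by_id = {c["id"]: c for c in comments}

    # Map each parent id to the ids of its direct replies, in input order.
    children = defaultdict(list)
    for c in comments:
        children[c["parent"]].append(c["id"])

    threads = []

    def walk(comment_id, path):
        record = by_id[comment_id]
        path = path + [{"user": record["user"], "text": record["text"]}]
        replies = children.get(comment_id, [])
        if not replies:
            # Keep only multi-turn exchanges (at least one reply).
            if len(path) > 1:
                threads.append(path)
            return
        for reply_id in replies:
            walk(reply_id, path)

    for root_id in children.get(None, []):
        walk(root_id, [])
    return threads


# Made-up sample data for illustration only.
sample = [
    {"id": 1, "parent": None, "user": "A", "text": "Great video!"},
    {"id": 2, "parent": 1, "user": "B", "text": "Which part did you like?"},
    {"id": 3, "parent": 2, "user": "A", "text": "The ending."},
    {"id": 4, "parent": 1, "user": "C", "text": "Agreed."},
    {"id": 5, "parent": None, "user": "D", "text": "First!"},
]

for thread in build_threads(sample):
    print(" -> ".join(turn["text"] for turn in thread))
```

Each branch of replies becomes its own dialogue, so a comment with two separate reply chains yields two conversations that share the same opening turn. Top-level comments with no replies are dropped because they carry no multi-turn signal.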

No commits in the last 6 months.

Use this if you need to gather authentic, multi-turn Chinese dialogue data from Bilibili video comment sections for qualitative analysis or to train conversational AI models.

Not ideal if you need a general-purpose web scraper for various websites, or if you require data beyond Bilibili comment threads.

social-media-research audience-analysis conversational-data chinese-web-content bilibili-analytics

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 8 / 25

Maturity 8 / 25

Community 8 / 25


Language: Python

License: none

Higher-rated alternatives

AI-Planning/l2p

Library for LLM-driven action model acquisition via natural language

datawhalechina/self-llm

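"The Open-Source LLM Cookbook": tutorials, tailored for Chinese beginners, on quickly fine-tuning (full-parameter/LoRA) and deploying Chinese and international open-source large language models (LLMs) and multimodal large models (MLLMs) in a Linux environment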

microsoft/LMOps

General technology for enabling AI capabilities w/ LLMs and MLLMs

theaniketgiri/create-llm

The fastest way to build and start training your own LLM. CLI tool that scaffolds...

liguodongiot/llm-action

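This project shares the technical principles behind large models along with hands-on experience (large-model engineering and putting large-model applications into production)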
