mrzjy/hoyo_public_wiki_parser
Parsing a Hoyoverse game text corpus from public wikis
This project helps game enthusiasts, content creators, or fan community managers gather comprehensive text information about Hoyoverse games like Genshin Impact and Honkai: Star Rail. It takes publicly available wiki pages and social media posts as input and provides structured and unstructured text data about game lore, characters, and community discussions. This is useful for anyone analyzing game content or community sentiment.
No commits in the last 6 months.
Use this if you need to systematically collect and organize large amounts of text data from public wikis and social media platforms related to Hoyoverse games.
Not ideal if you are looking for real-time social media monitoring or if you need a solution that actively interacts with game APIs or private community forums.
Stars: 12
Forks: 3
Language: Python
License: not listed
Category:
Last pushed: Aug 21, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/mrzjy/hoyo_public_wiki_parser"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
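For scripted use, the curl call above can be approximated in Python with only the standard library. This is a minimal sketch, not the service's documented client: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the `/quality/<category>/<owner>/<repo>` path layout is inferred from the single URL shown above, and the assumption that the endpoint returns JSON is not confirmed by this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example; the segment layout after it
# (category/owner/repo) is an assumption inferred from that one URL.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the request URL, percent-encoding each path segment."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the record for one repository.

    Assumption: the response body is JSON. If it is not, json.loads
    raises ValueError and the caller should handle it.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read().decode("utf-8")
    return json.loads(body)
```

Calling `fetch_quality("nlp", "mrzjy", "hoyo_public_wiki_parser")` issues the same request as the curl command. Network failures surface as `OSError` subclasses (including `urllib.error.URLError`), so wrap the call in `try`/`except (OSError, ValueError)` and keep the 100 requests/day keyless limit in mind when looping over many repositories.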
Higher-rated alternatives
gunthercox/chatterbot-corpus
A multilingual dialog corpus
EdinburghNLP/awesome-hallucination-detection
List of papers on hallucination detection in LLMs.
jfainberg/self_dialogue_corpus
The Self-dialogue Corpus - a collection of self-dialogues across music, movies and sports
jkkummerfeld/irc-disentanglement
Dataset and model for disentangling chat on IRC
Tomiinek/MultiWOZ_Evaluation
Unified MultiWOZ evaluation scripts for the context-to-response task.