kocohub/korean-hate-speech

Korean HateSpeech Dataset

Emerging

This dataset collects comments on Korean entertainment news for identifying and analyzing toxic speech. It includes human-annotated comments labeled for social bias (gender, others, none) and hate speech (hate, offensive, none), alongside a larger set of unlabeled comments and the associated news titles. It is aimed at practitioners in social media analysis, content moderation, and linguistic research on Korean online discourse.
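
As a rough illustration of the two annotation axes described above, the sketch below validates and counts labels in a tab-separated sample. The column names (`comments`, `bias`, `hate`), the TSV layout, and the helper `label_counts` are assumptions made for this example, and the rows are placeholders rather than real dataset content; check the repository for the actual file format before relying on it.

```python
import csv
import io
from collections import Counter

# Label sets taken from the dataset description.
BIAS_LABELS = {"gender", "others", "none"}
HATE_LABELS = {"hate", "offensive", "none"}

# Placeholder rows, NOT real dataset content.
# Column names and the TSV layout are assumptions for illustration.
SAMPLE_TSV = (
    "comments\tbias\thate\n"
    "<comment 1>\tnone\tnone\n"
    "<comment 2>\tgender\thate\n"
    "<comment 3>\tothers\toffensive\n"
    "<comment 4>\tnone\toffensive\n"
)


def label_counts(tsv_text):
    """Validate each row's labels and count them per annotation axis."""
    bias_counts, hate_counts = Counter(), Counter()
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row_number, row in enumerate(reader, start=1):
        if row["bias"] not in BIAS_LABELS:
            raise ValueError(f"row {row_number}: unknown bias label {row['bias']!r}")
        if row["hate"] not in HATE_LABELS:
            raise ValueError(f"row {row_number}: unknown hate label {row['hate']!r}")
        bias_counts[row["bias"]] += 1
        hate_counts[row["hate"]] += 1
    return bias_counts, hate_counts


bias_counts, hate_counts = label_counts(SAMPLE_TSV)
print(dict(bias_counts))
print(dict(hate_counts))
```

Checking labels against the closed sets before counting surfaces typos or unexpected categories early, which matters when the two axes are later used as separate classification targets.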

395 stars. No commits in the last 6 months.

Use this if you need to understand, detect, or research hate speech and social bias within Korean online comments, especially those related to entertainment news.

Not ideal if your focus is on general sentiment analysis or non-toxic comment classification, or if your primary language of interest is not Korean.

content-moderation social-media-analysis korean-linguistics online-safety public-discourse

Stale 6m No Package No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 16 / 25

Stars

395

Forks

Language

—

License

CC-BY-SA-4.0

Higher-rated alternatives

unitaryai/detoxify

Trained models & code to predict toxic comments on all 3 Jigsaw Toxic Comment Challenges. Built...

kensk8er/chicksexer

A Python package for gender classification.

Infinitode/ValX

ValX is an open-source Python package for text cleaning tasks, including profanity detection and...
