franciellevargas/HateBR
HateBR is the first large-scale, expert-annotated dataset of Brazilian Instagram comments for hate speech and offensive language detection on the web and social media.
HateBR is a dataset designed to help identify hate speech and offensive language in Brazilian Portuguese social media comments. It consists of Instagram comments, primarily directed at politicians, each labeled as either offensive or non-offensive. The labeled comments can be used to train and evaluate automated systems for content moderation or social listening. The dataset suits social media analysts, content moderation teams, and researchers studying online communication in Brazil.
Use this if you need to build or evaluate a system for automatically detecting offensive language or hate speech in Brazilian Portuguese social media content.
Not ideal if your focus is on a language other than Brazilian Portuguese, or if you need to analyze content from platforms other than Instagram comments.
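As a rough illustration of the "evaluate" use case, the sketch below scores a binary offensive/non-offensive classifier against gold labels using precision, recall, and F1 for the offensive class. The 1/0 label encoding, the function name score_offensive, and the sample label lists are assumptions made for this example; they are not taken from the HateBR files, whose actual column names and label values should be checked in the repository.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class BinaryScores:
    precision: float
    recall: float
    f1: float


def score_offensive(gold: Sequence[int], predicted: Sequence[int],
                    positive: int = 1) -> BinaryScores:
    """Precision, recall, and F1 for the positive (offensive) class.

    Assumes labels are encoded as 1 = offensive, 0 = non-offensive
    (a hypothetical encoding chosen for this sketch).
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return BinaryScores(precision, recall, f1)


# Made-up labels for illustration only; not real HateBR data.
gold = [1, 0, 1, 1, 0, 0]
predicted = [1, 0, 0, 1, 1, 0]
scores = score_offensive(gold, predicted)
print(scores)
```

Reporting scores for the offensive class separately, rather than overall accuracy alone, tends to be more informative for moderation work, where missed offensive comments and false alarms carry different costs.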
Stars
45
Forks
8
Language
—
License
—
Category
Last pushed
Jan 05, 2026
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/franciellevargas/HateBR"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
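The same request can be made from Python with the standard library. This is a minimal sketch: the URL pattern is copied from the curl command above, but the helper names quality_url and fetch_quality are hypothetical, and the assumption that the endpoint returns a JSON body is not confirmed by this page.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example; everything after it is owner/repo.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks"


def quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for a given owner and repository name."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for a repository.

    Assumes the response body is JSON (an assumption, not documented here).
    """
    with urlopen(quality_url(owner, repo), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("franciellevargas", "HateBR")
# print(data)
```

With the keyless limit of 100 requests/day, it is worth caching responses locally rather than re-fetching the same repository repeatedly.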
Higher-rated alternatives
Hironsan/HateSonar
Hate Speech Detection Library for Python.
t-davidson/hate-speech-and-offensive-language
Repository for the paper "Automated Hate Speech Detection and the Problem of Offensive...
rishabhmisra/News-Headlines-Dataset-For-Sarcasm-Detection
High quality dataset for the task of Sarcasm Detection
b4k0/CBDA
Cyber Bullying Detection Application (CBDA)
raklugrin01/Disaster-Tweets-Analysis-and-Classification
Analysing Disaster related tweets dataset and build a classifier using deep learning and deploy...