Hate Speech Detection Transformer Models

Tools and models for identifying, classifying, and mitigating hate speech, offensive language, and toxic content in text. Does NOT include general sentiment analysis, stance detection, or content moderation for non-hateful policy violations.

This category tracks 54 hate speech detection models. The highest-rated is StyrbjornKall/TRIDENT, with a score of 39/100 and 16 stars.

Get the projects as JSON (the example request below sets limit=20)

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=hate-speech-detection&limit=20"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
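For scripted access, the same request can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the endpoint and the three query parameters are copied from the curl command above, while the function names `build_url` and `fetch_projects` are hypothetical, and nothing is assumed about the shape of the JSON that comes back.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example; everything else here is illustrative.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="transformers", subcategory="hate-speech-detection", limit=20):
    """Build the query URL with the same parameters as the curl example."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=10):
    """Fetch the URL and decode the body as JSON.

    The response schema is not documented here, so the decoded value is
    returned as-is instead of being mapped onto assumed field names.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_projects(build_url())` performs the network request and counts against the daily request limit. Inspect the returned structure before relying on any particular field.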

| # | Model | Description | Score | Tier |
|---|-------|-------------|-------|------|
| 1 | StyrbjornKall/TRIDENT | A collection of transformer-based models and developmental scripts presented... | 39 | Emerging |
| 2 | Nithin-Holla/meme_challenge | Repository containing code from team Kingsterdam for the Hateful Memes Challenge | 39 | Emerging |
| 3 | viddexa/moderators | One package to moderate them all | 38 | Emerging |
| 4 | jaygala24/fed-hate-speech | The official code repository for the paper titled "A Federated Approach for... | 34 | Emerging |
| 5 | richouzo/hate-speech-detection-survey | Trained Neural Networks (LSTM, HybridCNN/LSTM, PyramidCNN, Transformers,... | 33 | Emerging |
| 6 | MusadiqPasha/Turkish-Hate-Speech-Classification-Explanation | Classify, explain, and rewrite Turkish hate speech tweets using BERT, SHAP,... | 32 | Emerging |
| 7 | ilias-ant/toxic-spans-detection | An attempt at SemEval 2021 Task 5: Toxic Spans Detection. | 31 | Emerging |
| 8 | avrtt/telegram-content-moderator | NLP/ViT-driven bot for detection & moderation of inappropriate content in... | 30 | Emerging |
| 9 | iqbal-sk/Detecting-Persuasion-Techniques-in-Memes | Hierarchical, multilingual, multimodal detection of persuasion techniques in... | 30 | Emerging |
| 10 | GU-DataLab/stance-detection-KE-MLM | Official resource of the paper "Knowledge Enhanced Masked Language Model for... | 29 | Experimental |
| 11 | nikhil6041/OLI-and-Meme-Classification | Author's implementation of the paper... | 27 | Experimental |
| 12 | cvcio/rtaa-classifier | Comments & Twitter accounts gRPC classification service. | 26 | Experimental |
| 13 | jdleo/tinysafe-1 | 71M parameter safety classifier (DeBERTa-v3-xsmall). Dual-head: binary... | 23 | Experimental |
| 14 | eftekhar-hossain/CUET_NLP-EACL_2021 | This repository contains the system description and the codes that we... | 23 | Experimental |
| 15 | TimeLordRaps/satisfiable-ai | Verified training data for frontier AI. Every sample passes a SAT gate.... | 22 | Experimental |
| 16 | jdleo/tinysafe-2 | 141M param safety model (not much better than v1, but a great learning) | 22 | Experimental |
| 17 | pkdubey/content_moderation | An AI-powered content moderation system using Python and Hugging Face... | 22 | Experimental |
| 18 | chuachinhon/transformers_state_trolls_cch | Detect state trolls on Twitter using Transformers + Comparison of results... | 22 | Experimental |
| 19 | mathildeoutters/Detect-patronizing-language | Participation to SemEval-2022 Task4 - Patronizing and Condescending Language... | 21 | Experimental |
| 20 | AditiBagora/Hasoc2021CodeMix | HASOC2021: Subtask 2 a) Codemix Challenge; Contains baselines and... | 21 | Experimental |
| 21 | premiouhxu4525/tinysafe-2 | Classify text as safe or unsafe using a 141M parameter DeBERTa-v3 model with... | 21 | Experimental |
| 22 | YukiFujimatsu/Personalized-Flaming-Prediction | Implementation of personalized real-time flaming risk prediction model (IIAI... | 21 | Experimental |
| 23 | ArunavaKumar/offenseval-nlp | Transformer-based offensive language detection using DistilBERT embeddings,... | 21 | Experimental |
| 24 | KvaytG/ru-toxicity-detector | A simple toxicity detector. | 21 | Experimental |
| 25 | shruti-sivakumar/Multimodal-Hateful-Memes-Detection | Multimodal deep learning pipeline for hateful meme detection using ResNet50... | 21 | Experimental |
| 26 | nayanpreet/AI-Powered-Toxic-Comment-Detection-and-Moderation-System | Transformer-based multi-label toxicity classifier with GenAI-assisted... | 21 | Experimental |
| 27 | minuva/fast-nlp-text-toxicity | Fast text toxicity classification model | 19 | Experimental |
| 28 | training-datalab/gold-standard-toxicity | Gold Standard for Toxicity and Incivility Project | 19 | Experimental |
| 29 | Damarcreative/secure-upload | Remove adult content in discord channels better with Artificial Intelligence. | 19 | Experimental |
| 30 | hiyouga/Toxic_Detection | BUAA SCSE Autumn 2021 Machine Learning Group Homework | 19 | Experimental |
| 31 | muzmax/MSTAR_feature_extraction | General Feature Extraction in SAR Target Classification: A Contrastive... | 18 | Experimental |
| 32 | devroopsaha744/HateSpeechDetect-text | In this project, I focused on benchmarking various machine learning models,... | 17 | Experimental |
| 33 | HumasFurquan/Hate-Speech-Detection-2.0 | End-to-end hate speech detection system using Transformer-based NLP models,... | 17 | Experimental |
| 34 | StyrbjornKall/TRIDENT_application | Source code for the web application associated with "Transformers enable... | 13 | Experimental |
| 35 | Brahmendra-Ramoju/TrustLayer_AI | AI-powered content moderation API with toxicity detection and trust scoring... | 13 | Experimental |
| 36 | Kirti-Vatsh/NLP---Toxic-Comment-Classification | Classifying toxic comments using NLP, machine learning, and deep learning.... | 13 | Experimental |
| 37 | Fabio295/tinysafe-1 | Detect harmful content with a 71M-parameter safety classifier using... | 13 | Experimental |
| 38 | yellatp/detoxify-telugu | A Fine-Tuned BERT-Based Language Model for Hate Speech Detection in Telugu & Tenglish | 13 | Experimental |
| 39 | jaychampaneri14/content-moderator | Multi-label content moderation for text and images | 13 | Experimental |
| 40 | lopezrbn/kaggle_toxicity_challenge | Multi-label toxic-comment classifier (DeBERTa v3, Kaggle Jigsaw Challenge)... | 13 | Experimental |
| 41 | Bhawnakapri/DeepSignal-AI-Safety-Engine | Transformer-based AI Safety Intelligence System for multi-label... | 13 | Experimental |
| 42 | kanincityy/misogyny_detection_transformers | Building an Effective Misogyny Detection Classifier for Low-Resource Languages | 13 | Experimental |
| 43 | rafelps/HLE-UPC-SemEval-2021-ToxicSpansDetection | HLE-UPC at SemEval-2021 Task 5: Toxic Spans Detection | 12 | Experimental |
| 44 | Mussabat/HateSpeech-EACL-2024 | This repository contains the system description and the codes that we... | 12 | Experimental |
| 45 | MagixIsAvailable/nlp_toxic_language | A Real-Time AI Safety Filter that detects toxic speech from live audio using... | 12 | Experimental |
| 46 | yaekobB/Toxic-Comment-Classification | Multi-label toxic comment classification using DistilBERT with explainable... | 12 | Experimental |
| 47 | karish-grover/Humor-Analysis-using-Ensembles-of-Simple-Transformers | This paper describes Humor Analysis using Ensembles of Simple Transformers,... | 12 | Experimental |
| 48 | imdiptanu/MAD | MAD: A Multi-task Aggression Detection Framework | 11 | Experimental |
| 49 | rkeerthikant/proposed-model-for-trolls-detection | Fine tuning DistilBERT, BERT-medium-uncased, T5-Small, MobileBERT for... | 11 | Experimental |
| 50 | Anshumaan-Chauhan02/HumanVsAI-Sarcasm-Detection | Large Language Model performing a binary classification task of detecting... | 11 | Experimental |
| 51 | pavel-kalmykov/stance-detection-for-spanish-and-catalan | Transformers applied to Stance Detection for Spanish and Catalan languages | 11 | Experimental |
| 52 | romsto/Inappropriate-Language-Classifier | Online video games need a better system to detect inappropriate language in... | 11 | Experimental |
| 53 | neil-ab/contextual-hatespeech-detection | A contextual (BERT-based) hate speech detection model | 10 | Experimental |
| 54 | rajendranu4/stance-detection | Exploration of Contrastive Learning Strategies toward more Robust Stance Detection | 10 | Experimental |

Comparisons in this category