Hate Speech Detection Transformer Models
Tools and models for identifying, classifying, and mitigating hate speech, offensive language, and toxic content in text. Does NOT include general sentiment analysis, stance detection, or content moderation for non-hateful policy violations.
There are 54 hate speech detection models tracked. The highest-rated is StyrbjornKall/TRIDENT at 39/100 with 16 stars.
Get the 54 projects as JSON (the example request sets `limit=20`):
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=hate-speech-detection&limit=20"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
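The same request can be made from Python. The following is a minimal sketch using only the standard library; it assumes the endpoint returns a JSON body, and since the response schema is not described here, it returns the parsed payload without interpreting it. The helper names `build_url` and `fetch_quality` are hypothetical, chosen for illustration.

```python
import json
import urllib.parse
import urllib.request

# Endpoint and query parameters are taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(limit=20, domain="transformers", subcategory="hate-speech-detection"):
    """Build the request URL with the same query parameters as the curl example."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_quality(limit=20, timeout=30):
    """Fetch and parse the JSON payload.

    Assumption: the body is JSON. Its structure is not documented in this
    listing, so inspect the result before relying on any particular key.
    """
    with urllib.request.urlopen(build_url(limit), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality()` issues one request against the daily quota. Whether `limit` accepts values above 20 is not stated here, so treat larger values as something to verify against the API.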
| # | Model | Description | Score | Tier |
|---|---|---|---|---|
| 1 | StyrbjornKall/TRIDENT | A collection of transformer-based models and developmental scripts presented... | | Emerging |
| 2 | Nithin-Holla/meme_challenge | Repository containing code from team Kingsterdam for the Hateful Memes Challenge | | Emerging |
| 3 | viddexa/moderators | One package to moderate them all | | Emerging |
| 4 | jaygala24/fed-hate-speech | The official code repository for the paper titled "A Federated Approach for... | | Emerging |
| 5 | richouzo/hate-speech-detection-survey | Trained Neural Networks (LSTM, HybridCNN/LSTM, PyramidCNN, Transformers,... | | Emerging |
| 6 | MusadiqPasha/Turkish-Hate-Speech-Classification-Explanation | Classify, explain, and rewrite Turkish hate speech tweets using BERT, SHAP,... | | Emerging |
| 7 | ilias-ant/toxic-spans-detection | An attempt at SemEval 2021 Task 5: Toxic Spans Detection. | | Emerging |
| 8 | avrtt/telegram-content-moderator | NLP/ViT-driven bot for detection & moderation of inappropriate content in... | | Emerging |
| 9 | iqbal-sk/Detecting-Persuasion-Techniques-in-Memes | Hierarchical, multilingual, multimodal detection of persuasion techniques in... | | Emerging |
| 10 | GU-DataLab/stance-detection-KE-MLM | Official resource of the paper "Knowledge Enhanced Masked Language Model for... | | Experimental |
| 11 | nikhil6041/OLI-and-Meme-Classification | Author's implementation of the paper... | | Experimental |
| 12 | cvcio/rtaa-classifier | Comments & Twitter accounts gRPC classification service. | | Experimental |
| 13 | jdleo/tinysafe-1 | 71M parameter safety classifier (DeBERTa-v3-xsmall). Dual-head: binary... | | Experimental |
| 14 | eftekhar-hossain/CUET_NLP-EACL_2021 | This repository contains the system description and the codes that we... | | Experimental |
| 15 | TimeLordRaps/satisfiable-ai | Verified training data for frontier AI. Every sample passes a SAT gate.... | | Experimental |
| 16 | jdleo/tinysafe-2 | 141M param safety model (not much better than v1, but a great learning) | | Experimental |
| 17 | pkdubey/content_moderation | An AI-powered content moderation system using Python and Hugging Face... | | Experimental |
| 18 | chuachinhon/transformers_state_trolls_cch | Detect state trolls on Twitter using Transformers + Comparison of results... | | Experimental |
| 19 | mathildeoutters/Detect-patronizing-language | Participation to SemEval-2022 Task4 - Patronizing and Condescending Language... | | Experimental |
| 20 | AditiBagora/Hasoc2021CodeMix | HASOC2021: Subtask 2 a) Codemix Challenge; Contains baselines and... | | Experimental |
| 21 | premiouhxu4525/tinysafe-2 | Classify text as safe or unsafe using a 141M parameter DeBERTa-v3 model with... | | Experimental |
| 22 | YukiFujimatsu/Personalized-Flaming-Prediction | Implementation of personalized real-time flaming risk prediction model (IIAI... | | Experimental |
| 23 | ArunavaKumar/offenseval-nlp | Transformer-based offensive language detection using DistilBERT embeddings,... | | Experimental |
| 24 | KvaytG/ru-toxicity-detector | A simple toxicity detector. | | Experimental |
| 25 | shruti-sivakumar/Multimodal-Hateful-Memes-Detection | Multimodal deep learning pipeline for hateful meme detection using ResNet50... | | Experimental |
| 26 | nayanpreet/AI-Powered-Toxic-Comment-Detection-and-Moderation-System | Transformer-based multi-label toxicity classifier with GenAI-assisted... | | Experimental |
| 27 | minuva/fast-nlp-text-toxicity | Fast text toxicity classification model | | Experimental |
| 28 | training-datalab/gold-standard-toxicity | Gold Standard for Toxicity and Incivility Project | | Experimental |
| 29 | Damarcreative/secure-upload | Remove adult content in discord channels better with Artificial Intelligence. | | Experimental |
| 30 | hiyouga/Toxic_Detection | BUAA SCSE Autumn 2021 Machine Learning Group Homework | | Experimental |
| 31 | muzmax/MSTAR_feature_extraction | General Feature Extraction in SAR Target Classification: A Contrastive... | | Experimental |
| 32 | devroopsaha744/HateSpeechDetect-text | In this project, I focused on benchmarking various machine learning models,... | | Experimental |
| 33 | HumasFurquan/Hate-Speech-Detection-2.0 | End-to-end hate speech detection system using Transformer-based NLP models,... | | Experimental |
| 34 | StyrbjornKall/TRIDENT_application | Source code for the web application associated with "Transformers enable... | | Experimental |
| 35 | Brahmendra-Ramoju/TrustLayer_AI | AI-powered content moderation API with toxicity detection and trust scoring... | | Experimental |
| 36 | Kirti-Vatsh/NLP---Toxic-Comment-Classification | Classifying toxic comments using NLP, machine learning, and deep learning.... | | Experimental |
| 37 | Fabio295/tinysafe-1 | Detect harmful content with a 71M-parameter safety classifier using... | | Experimental |
| 38 | yellatp/detoxify-telugu | A Fine-Tuned BERT-Based Language Model for Hate Speech Detection in Telugu & Tenglish | | Experimental |
| 39 | jaychampaneri14/content-moderator | Multi-label content moderation for text and images | | Experimental |
| 40 | lopezrbn/kaggle_toxicity_challenge | Multi-label toxic-comment classifier (DeBERTa v3, Kaggle Jigsaw Challenge) —... | | Experimental |
| 41 | Bhawnakapri/DeepSignal-AI-Safety-Engine | Transformer-based AI Safety Intelligence System for multi-label... | | Experimental |
| 42 | kanincityy/misogyny_detection_transformers | Building an Effective Misogyny Detection Classifier for Low-Resource Languages | | Experimental |
| 43 | rafelps/HLE-UPC-SemEval-2021-ToxicSpansDetection | HLE-UPC at SemEval-2021 Task 5: Toxic Spans Detection | | Experimental |
| 44 | Mussabat/HateSpeech-EACL-2024 | This repository contains the system description and the codes that we... | | Experimental |
| 45 | MagixIsAvailable/nlp_toxic_language | A Real-Time AI Safety Filter that detects toxic speech from live audio using... | | Experimental |
| 46 | yaekobB/Toxic-Comment-Classification | Multi-label toxic comment classification using DistilBERT with explainable... | | Experimental |
| 47 | karish-grover/Humor-Analysis-using-Ensembles-of-Simple-Transformers | This paper describes Humor Analysis using Ensembles of Simple Transformers,... | | Experimental |
| 48 | imdiptanu/MAD | MAD: A Multi-task Aggression Detection Framework | | Experimental |
| 49 | rkeerthikant/proposed-model-for-trolls-detection | Fine tuning DistilBERT, BERT-medium-uncased, T5-Small, MobileBERT for... | | Experimental |
| 50 | Anshumaan-Chauhan02/HumanVsAI-Sarcasm-Detection | Large Language Model performing a binary classification task of detecting... | | Experimental |
| 51 | pavel-kalmykov/stance-detection-for-spanish-and-catalan | Transformers applied to Stance Detection for Spanish and Catalan languages | | Experimental |
| 52 | romsto/Inappropriate-Language-Classifier | Online video games need a better system to detect inappropriate language in... | | Experimental |
| 53 | neil-ab/contextual-hatespeech-detection | A contextual (BERT-based) hate speech detection model | | Experimental |
| 54 | rajendranu4/stance-detection | Exploration of Contrastive Learning Strategies toward more Robust Stance Detection | | Experimental |