Bias Measurement Evaluation NLP Tools

Tools and datasets for detecting, measuring, and quantifying bias in NLP models and language systems. Includes benchmarks, metrics, and evaluation methods for assessing fairness across different demographic groups and intersectional categories. Does NOT include general bias mitigation techniques, debiasing methods without evaluation focus, or application-specific bias detection (e.g., hate speech or toxic comment detection).

There are 42 bias measurement evaluation tools tracked. One scores above 50 (established tier). The highest-rated is dccuchile/wefe at 53/100 with 183 stars.

Get the projects as JSON (the example request below sets limit=20)

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=nlp&subcategory=bias-measurement-evaluation&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
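The same request can be sketched in Python using only the standard library. The helper names `build_url` and `fetch_projects` are hypothetical, and the response schema is not described on this page, so the parsed JSON is returned untouched. Whether the endpoint accepts a `limit` above 20 is an assumption to verify against the API itself.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="nlp", subcategory="bias-measurement-evaluation", limit=20):
    """Build the request URL with the same query parameters as the curl example."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(limit=20, timeout=30):
    """Fetch and decode the JSON body.

    The response structure is not documented here, so no fields are assumed;
    the decoded value is returned as-is for the caller to inspect.
    """
    with urlopen(build_url(limit=limit), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Printing the URL performs no network call; call fetch_projects() to
    # issue the request (it counts against the 100 requests/day limit).
    print(build_url())
```

Calling `fetch_projects()` without a key falls under the anonymous quota mentioned above, so it is worth caching the result locally rather than re-fetching on every run.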

#	Tool	Score	Tier	Stars	Language
1	dccuchile/wefe WEFE: The Word Embeddings Fairness Evaluation Framework. WEFE is a framework...	53	Established	183	Python
2	dreji18/Fairness-in-AI Detecting Bias and ensuring Fairness in AI solutions	43	Emerging	102	Jupyter Notebook
3	amazon-science/bold Dataset associated with "BOLD: Dataset and Metrics for Measuring Biases in...	42	Emerging	87	—
4	dhfbk/variationist Variationist: Exploring Multifaceted Variation and Bias in Written Language...	40	Emerging	10	Python
5	soarsmu/BiasFinder BiasFinder | IEEE TSE | Metamorphic Test Generation to Uncover Bias for...	37	Emerging	11	Jupyter Notebook
6	microsoft/SafeNLP Safety Score for Pre-Trained Language Models	33	Emerging	96	Python
7	CAMeL-Lab/gender-rewriting-shared-task Evaluation code and data for the gender rewriting shared task	29	Experimental	1	Python
8	jasonshaoshun/SAL code for "Spectral Removal of Guarded Attribute Information"	29	Experimental	7	Jupyter Notebook
9	grecosalvatore/nlpguard NLPGuard: A Framework for Mitigating the use of Protected Attributes in NLP	29	Experimental	5	Python
10	princeton-nlp/MABEL EMNLP 2022: "MABEL: Attenuating Gender Bias using Textual Entailment Data"...	29	Experimental	38	Python
11	darenr/gender-bias Real-time JavaScript gender bias detector	29	Experimental	4	JavaScript
12	krangelie/bias-in-german-nlg Master thesis: Exploring bias in German NLG (GPT-3 & GerPT-2). Applies...	27	Experimental	16	Jupyter Notebook
13	feyzaakyurek/bbnli Bias Benchmark for Natural Language Inference. Code repo for the Findings of...	27	Experimental	15	Python
14	candacelax/bias-in-vision-and-language Code for paper "Measuring Social Biases in Grounded Vision and Language Embeddings"	26	Experimental	9	Shell
15	cs329yangzhong/WIKIBIAS Code and data for EMNLP2021 paper: WIKIBIAS: Detecting Multi-Span Subjective...	24	Experimental	4	Python
16	yipenglai/Wikipedia-Gender-Bias Measure gender bias in English Wikipedia biographies through text analysis in R	24	Experimental	4	R
17	sathvikn/word_embedding_bias Companion to my blog post: How Biases in Language get Perpetuated by Technology	23	Experimental	4	Jupyter Notebook
18	minnesotanlp/Quantifying-Annotation-Disagreement Official implementation of Wan et al.'s paper "Everyone's Voice Matters:...	22	Experimental	6	Jupyter Notebook
19	PieTempesti98/biases_in_hiring_decisions Review of the most studied biases in the hiring process made by Pietro...	21	Experimental	1	Jupyter Notebook
20	google-research-datasets/nlp-fairness-for-india Contains data resources to replicate results from the paper...	21	Experimental	12	—
21	groovychoons/GlobalBias The official repo for the GlobalBias dataset and associated paper: 'Who is...	20	Experimental	5	Jupyter Notebook
22	jasonshaoshun/AMSAL code for "Erasure of Unaligned Attributes from Neural Representations"	20	Experimental	7	Python
23	tinotavingeyi-droid/ubuntu-xai An open-source research platform for evaluating AI bias, fairness, and...	19	Experimental	—	TypeScript
24	CAMeL-Lab/gender-rewriting Code, models, and data for "User-Centric Gender Rewriting". NAACL 2022.	19	Experimental	3	Python
25	martinsjaavik/llm-bias-norwegian Master thesis on subtler biases	19	Experimental	1	Python
26	feyzaakyurek/bias-textgen Code for the paper "Challenges in Measuring Bias in Open-Ended Language...	19	Experimental	4	Python
27	venkatasg/interpersonal-bias Code and data for the paper 'How people talk about each other: Modeling...	18	Experimental	2	Jupyter Notebook
28	Ahmad-AlSubaie/CS499-DL-debaising Repository for research done into the methods used to debias ML models....	18	Experimental	2	Jupyter Notebook
29	VSteinborn/s_jsd-multilingual-bias Code and data for the paper "An Information-Theoretic Approach and Dataset...	18	Experimental	5	Python
30	iamshnoo/soc_bias Reproduction for NAACL paper on Socially Aware Bias Measurements for Hindi	17	Experimental	1	Python
31	VSteinborn/politeness-attacks Code and data for the paper "Politeness Stereotypes and Attack Vectors:...	17	Experimental	1	Python
32	asimokby/formality-bias-analysis This repo contains the annotations and other artifacts of the paper titled:...	17	Experimental	1	—
33	erica-dessi/Modelli-linguistici-e-discriminazione-nascosta-il-bias-di-genere-nelle-professioni This thesis explores the phenomenon of gender bias in Large Language...	15	Experimental	—	—
34	iampeti/Thesis_Gender_Bias 📊 Investigate gender bias in clinical research through statistical analysis...	14	Experimental	—	R
35	hyoungjo/lipstick-on-a-pig Debiasing methods on contextualised embeddings are ineffective - CS475	13	Experimental	—	Jupyter Notebook
36	ShamikRoy/Moral-Role-Prediction This repository contains the dataset and codes for the task of Morality...	12	Experimental	5	—
37	spidersouris/GeNRe [ACL 2025 Findings] GeNRe: A French Gender-Neutral Rewriting System Using...	11	Experimental	1	Python
38	B-VARUN-REDDY/FairwAI-Bias-Detection Submission for the FairwAI Hospitality Intern Challenge. This project...	11	Experimental	—	Python
39	thesofakillers/badder-seeds Official repository for the paper "[Re] Badder Seeds: Reproducing the...	11	Experimental	4	Python
40	sunyam/bias-literary-classification Measuring the Effects of Bias in Training Data for Literary Classification	11	Experimental	3	Jupyter Notebook
41	koc-lab/legalbias This repository contains the required codes for reproducing the results in...	11	Experimental	3	Python
42	Carolinecasey17/Thesis_NLP_GenderBias_AustralianJobDescriptions Scripts for Cognitive Science Masters Thesis - Investigating Implicit Gender...	10	Experimental	2	Jupyter Notebook