LLM Bias Evaluation Tools
Tools and frameworks for detecting, measuring, and auditing biases in large language models across domains like mental health, hiring, news, and stereotypes. Includes bias benchmarks, evaluation metrics, and mitigation techniques. Does NOT include general fairness frameworks, bias in other ML models, or non-LLM applications.
19 LLM bias evaluation tools are tracked. One scores above 50 (Established tier). The highest-rated is cvs-health/langfair at 60/100, with 255 stars.
Get all 19 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=llm-tools&subcategory=llm-bias-evaluation&limit=20"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
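The curl request above can also be issued from a script. The following is a minimal Python sketch using only the standard library. The helper names `build_url` and `fetch_projects` are hypothetical, and because the response schema is not described here, the parsed JSON is returned unchanged rather than unpacked into assumed fields.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint and query parameters are copied from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="llm-tools", subcategory="llm-bias-evaluation", limit=20):
    """Return the request URL with the same query string as the curl call."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=10):
    """Fetch the URL and decode the response body as JSON.

    Assumption: the endpoint returns a JSON body. Its structure is not
    documented in this listing, so the parsed value is returned as-is.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_projects(build_url())` performs one request, which counts against the 100 requests/day keyless limit, so caching the result locally is a reasonable choice for repeated analysis.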
| # | Tool | Description | Score | Tier |
|---|---|---|---|---|
| 1 | cvs-health/langfair | LangFair is a Python library for conducting use-case level LLM bias and... | 60/100 | Established |
| 2 | BetterForAll/HonestyMeter | HonestyMeter: An NLP-powered framework for evaluating objectivity and bias... | | Emerging |
| 3 | bws82/biasclear | Structural bias detection and correction engine built on Persistent... | | Emerging |
| 4 | KID-22/LLM-IR-Bias-Fairness-Survey | This is the repo for the survey of Bias and Fairness in IR with LLMs. | | Emerging |
| 5 | Hanpx20/SafeSwitch | Official code repository for the paper "Internal Activation as the Polar... | | Emerging |
| 6 | faiyazabdullah/TranslationTangles | Uncovering Performance Gaps and Bias Patterns in LLM-Based Translations... | | Experimental |
| 7 | UltraDeep-Tech/lcb-bench | LLM Cognitive Bias Benchmark: 1,500 test cases measuring 30 cognitive biases... | | Experimental |
| 8 | minnesotanlp/cobbler | Code and data for Koo et al's ACL 2024 paper "Benchmarking Cognitive Biases... | | Experimental |
| 9 | zhuohaoyu/KIEval | [ACL'24] A Knowledge-grounded Interactive Evaluation Framework for Large... | | Experimental |
| 10 | grecosalvatore/StereoBusters-GSI-Detect-Evalita2026 | This repository contains the code of the team StereoBusters for the Evalita... | | Experimental |
| 11 | tddschn/llm-biases | LLM Biases Research | | Experimental |
| 12 | Robert-Morabito/STOP | Repository for the paper STOP! Benchmarking Large Language Models with... | | Experimental |
| 13 | gopi703/cultural-advice-bias | 🌍 Visualize cultural bias in AI therapy advice, revealing how local... | | Experimental |
| 14 | Pikeras72/EQUITIA | Tool for the automatic assessment of biases in LLM models | | Experimental |
| 15 | AndrewHeller17/Effect-of-Emotional-Framing-on-LLM-Performance | Evaluated the impact of emotional prompt framing on LLM reasoning accuracy... | | Experimental |
| 16 | Trust4AI/GUARD-ME | AI-guided Evaluator for Bias Detection using Metamorphic Testing | | Experimental |
| 17 | charlie-campanella/big-city-bias | Code for the paper "Big City Bias: Evaluating the Impact of Metropolitan... | | Experimental |
| 18 | JayanaGunaweera01/EthAIAuditHub | An automated, collaborative ethical bias auditing platform for ML models.... | | Experimental |
| 19 | steinathan/bullshitmeter | This is a super-powered bullshit detector that can measure the amount of... | | Experimental |