amazon-science/bold
Dataset accompanying the paper "BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation".
This dataset helps evaluate fairness in English-language text generation systems by providing nearly 24,000 text prompts. You feed these prompts to your language model, then analyze the generated continuations to measure biases across gender, race, profession, and religious or political ideology. AI ethicists, machine learning researchers, and product managers building AI-powered writing tools can use it to support responsible AI development.
No commits in the last 6 months.
Use this if you need a standardized collection of prompts to test your language model for unintended biases across sensitive social dimensions.
Not ideal if you are looking for a tool to automatically fix biases or if you need to evaluate bias in languages other than English.
Stars
87
Forks
15
Language
—
License
—
Category
nlp
Last pushed
Mar 02, 2021
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/amazon-science/bold"
Open to everyone: 100 requests/day with no key required. A free key raises the limit to 1,000/day.
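The same endpoint can be called from Python. A minimal sketch, assuming the endpoint shown above returns JSON; the `Authorization: Bearer` header name for keyed access is an assumption, not documented in this listing.

```python
# Hedged sketch: query the quality API for a repository record.
# Only the base URL is taken from the listing; response shape and the
# Authorization header are assumptions.
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality/nlp"


def build_url(owner: str, repo: str) -> str:
    """Compose the per-repository endpoint URL."""
    return f"{API_BASE}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str, api_key=None) -> dict:
    """GET the quality record; pass a key to raise the daily rate limit."""
    req = urllib.request.Request(build_url(owner, repo))
    if api_key:  # anonymous access allows 100 requests/day
        req.add_header("Authorization", f"Bearer {api_key}")  # assumed header
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


if __name__ == "__main__":
    print(build_url("amazon-science", "bold"))
```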
Higher-rated alternatives
dccuchile/wefe
WEFE: The Word Embeddings Fairness Evaluation Framework. WEFE is a framework that standardizes...
dreji18/Fairness-in-AI
Detecting Bias and ensuring Fairness in AI solutions
dhfbk/variationist
Variationist: Exploring Multifaceted Variation and Bias in Written Language Data (ACL 2024 demo track)
soarsmu/BiasFinder
BiasFinder | IEEE TSE | Metamorphic Test Generation to Uncover Bias for Sentiment Analysis Systems
microsoft/SafeNLP
Safety Score for Pre-Trained Language Models