amazon-science/bold
Dataset accompanying the paper "BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation".
This dataset helps evaluate fairness in English-language text generation systems by providing nearly 24,000 text prompts. You feed these prompts to your language model, then analyze the generated continuations to measure biases across gender, race, profession, and religious or political ideology. AI ethicists, machine learning researchers, and product managers building AI-powered writing tools can use it to support responsible AI development.
No commits in the last 6 months.
Use this if you need a standardized collection of prompts to test your language model for unintended biases across sensitive social dimensions.
Not ideal if you are looking for a tool to automatically fix biases or if you need to evaluate bias in languages other than English.
Stars
87
Forks
15
Language
—
License
—
Category
nlp
Last pushed
Mar 02, 2021
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/amazon-science/bold"
Open to everyone: 100 requests/day with no key required. A free key raises the limit to 1,000/day.
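The same endpoint can be called from Python. A minimal sketch, assuming the endpoint shown above returns JSON; the `Authorization: Bearer` header name for keyed access is an assumption, not documented in this listing.

```python
# Hedged sketch: query the quality API for a repository record.
# Only the base URL is taken from the listing; response shape and the
# Authorization header are assumptions.
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality/nlp"


def build_url(owner: str, repo: str) -> str:
    """Compose the per-repository endpoint URL."""
    return f"{API_BASE}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str, api_key=None) -> dict:
    """GET the quality record; pass a key to raise the daily rate limit."""
    req = urllib.request.Request(build_url(owner, repo))
    if api_key:  # anonymous access allows 100 requests/day
        req.add_header("Authorization", f"Bearer {api_key}")  # assumed header
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


if __name__ == "__main__":
    print(build_url("amazon-science", "bold"))
```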
Higher-rated alternatives
dccuchile/wefe
WEFE: The Word Embeddings Fairness Evaluation Framework. WEFE is a framework that standardizes...
dreji18/Fairness-in-AI
Detecting Bias and ensuring Fairness in AI solutions
dhfbk/variationist
Variationist: Exploring Multifaceted Variation and Bias in Written Language Data (ACL 2024 demo track)
soarsmu/BiasFinder
BiasFinder | IEEE TSE | Metamorphic Test Generation to Uncover Bias for Sentiment Analysis Systems
microsoft/SafeNLP
Safety Score for Pre-Trained Language Models