google-research-datasets/nlp-fairness-for-india

Contains data resources to replicate results from the paper “Re-contextualizing Fairness in NLP: The Case of India”.

21 / 100 (Experimental)

This project provides data for examining how Natural Language Processing (NLP) models may reflect biases specific to India. It contains lists of identity terms (like 'Gujarati') and attributes (like 'entrepreneur'), along with human annotations of their stereotypical associations. These resources help researchers and practitioners reproduce and understand analyses of such biases in NLP corpora and models, especially in the Indian geo-cultural context.

No commits in the last 6 months.

Use this if you are an NLP researcher, data scientist, or ethicist focusing on fairness and bias in AI, particularly within the Indian linguistic and cultural landscape.

Not ideal if you are looking for a general-purpose dataset on global NLP fairness without a specific focus on India, or if you need to perform bias analysis on non-textual data.

AI-ethics NLP-bias-detection India-specific-AI sociolinguistics cultural-contextualization
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 5 / 25
Maturity 16 / 25
Community 0 / 25
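
The four category scores listed here add up to the overall 21 / 100 shown near the top of the page. The short check below illustrates that arithmetic. It assumes the overall score is a plain sum of the four categories, which matches the numbers on this page but is not confirmed by the page itself. The variable names are illustrative only.

```python
# Category scores as listed on this page, each out of 25.
category_scores = {
    "Maintenance": 0,
    "Adoption": 5,
    "Maturity": 16,
    "Community": 0,
}
MAX_PER_CATEGORY = 25

# Assumption: the overall score is the unweighted sum of the categories.
total = sum(category_scores.values())
max_total = MAX_PER_CATEGORY * len(category_scores)

print(f"{total} / {max_total}")  # prints 21 / 100
```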


Stars: 12
Forks:
Language:
License: Apache-2.0
Last pushed: Jul 04, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/google-research-datasets/nlp-fairness-for-india"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
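
The same request as the curl command above can be made from Python using only the standard library. This is a minimal sketch, not a documented client. The helper names `build_quality_url` and `fetch_quality` are hypothetical, the `category/owner/repo` path layout is inferred from the single URL shown above, and the response is assumed to be JSON, which this page does not state.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (hypothetical helper).

    Assumption: the path is <category>/<owner>/<repo>, inferred from
    the one example URL; the API may not follow this for other inputs.
    """
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response (hypothetical helper).

    Assumption: the body is JSON. The schema is not documented here,
    so the decoded object is returned as-is without field access.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example usage (performs a network request, so it is left commented out):
# data = fetch_quality("nlp", "google-research-datasets", "nlp-fairness-for-india")
# print(json.dumps(data, indent=2))
```

Keyless access is limited to 100 requests/day, so cache the result locally rather than re-fetching it on every run.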