MilaNLProc/honest

A Python package to compute HONEST, a score to measure hurtful sentence completions in language models. Published at NAACL 2021.

Overall score: 46 / 100 (Emerging)

This tool helps evaluate how likely large language models (LLMs) are to generate harmful or stereotypical text, especially concerning gender and LGBTQIA+ individuals. You provide a language model, and it outputs a quantitative HONEST score indicating the model's propensity to complete sentences in a hurtful way. It is aimed at researchers and practitioners who develop or deploy LLMs and want to measure and mitigate bias.
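
In practice, evaluation follows three steps: load the HONEST templates, fill them with completions from the model under test, and score those completions. The snippet below is a minimal sketch assuming the honest PyPI package together with a Hugging Face fill-mask pipeline; the method names (HonestEvaluator, templates, honest) follow the project README and should be verified against the installed version.

# Sketch: score a masked language model with HONEST.
# Assumes the `honest` PyPI package and `transformers`; names follow the project README.
from transformers import AutoModelForMaskedLM, AutoTokenizer, pipeline
from honest import honest

name_model = "bert-base-uncased"
k = 5  # number of completions to collect per template

# Load the English HONEST templates (binary gender set)
evaluator = honest.HonestEvaluator("en")
masked_templates = evaluator.templates(data_set="binary")

# Build a fill-mask pipeline with the model under evaluation
tokenizer = AutoTokenizer.from_pretrained(name_model)
model = AutoModelForMaskedLM.from_pretrained(name_model)
nlp_fill = pipeline("fill-mask", model=model, tokenizer=tokenizer, top_k=k)

# Fill each template and keep the top-k completion strings
filled_templates = [
    [fill["token_str"].strip() for fill in nlp_fill(sentence.replace("[M]", tokenizer.mask_token))]
    for sentence in masked_templates.keys()
]

# Compute the HONEST score (share of hurtful completions)
honest_score = evaluator.honest(filled_templates)
print(name_model, honest_score)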

No commits in the last 6 months. Available on PyPI.

Use this if you are developing, fine-tuning, or evaluating a large language model and need to systematically measure its potential for generating harmful or stereotypical content.

Not ideal if you are a general user interested in checking the bias of a pre-trained, off-the-shelf LLM without needing to integrate it into a development workflow.

Topics: AI ethics, natural language processing, bias detection, model evaluation, responsible AI
Status: Stale (6 months)
Maintenance: 0 / 25
Adoption: 6 / 25
Maturity: 25 / 25
Community: 15 / 25


Stars: 21
Forks: 4
Language: Python
License: MIT
Last pushed: Apr 08, 2025
Commits (30d): 0
Dependencies: 3

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/MilaNLProc/honest"

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
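
The same data can be fetched programmatically; the sketch below assumes only the endpoint shown above, and since the response schema is not documented here it simply prints whatever JSON comes back.

# Sketch: fetch the quality data in Python (assumes the `requests` package).
import requests

URL = "https://pt-edge.onrender.com/api/v1/quality/embeddings/MilaNLProc/honest"

resp = requests.get(URL, timeout=10)
resp.raise_for_status()   # fail loudly on HTTP errors
print(resp.json())        # response fields are not documented here; inspect the output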