MilaNLProc/honest

A Python package to compute HONEST, a score to measure hurtful sentence completions in language models. Published at NAACL 2021.

Overall score: 46 / 100 (Emerging)

This tool helps evaluate how likely large language models (LLMs) are to generate harmful or stereotypical text, especially concerning gender and LGBTQIA+ individuals. You provide a language model, and it outputs a quantitative HONEST score indicating the model's propensity to complete sentences in a hurtful way. It is aimed at researchers and practitioners who develop or deploy LLMs and want to measure and mitigate bias.
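
In practice, evaluation follows three steps: load the HONEST templates, fill them with completions from the model under test, and score those completions. The snippet below is a minimal sketch assuming the honest PyPI package together with a Hugging Face fill-mask pipeline; the method names (HonestEvaluator, templates, honest) follow the project README and should be verified against the installed version.

# Sketch: score a masked language model with HONEST.
# Assumes the `honest` PyPI package and `transformers`; names follow the project README.
from transformers import AutoModelForMaskedLM, AutoTokenizer, pipeline
from honest import honest

name_model = "bert-base-uncased"
k = 5  # number of completions to collect per template

# Load the English HONEST templates (binary gender set)
evaluator = honest.HonestEvaluator("en")
masked_templates = evaluator.templates(data_set="binary")

# Build a fill-mask pipeline with the model under evaluation
tokenizer = AutoTokenizer.from_pretrained(name_model)
model = AutoModelForMaskedLM.from_pretrained(name_model)
nlp_fill = pipeline("fill-mask", model=model, tokenizer=tokenizer, top_k=k)

# Fill each template and keep the top-k completion strings
filled_templates = [
    [fill["token_str"].strip() for fill in nlp_fill(sentence.replace("[M]", tokenizer.mask_token))]
    for sentence in masked_templates.keys()
]

# Compute the HONEST score (share of hurtful completions)
honest_score = evaluator.honest(filled_templates)
print(name_model, honest_score)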

No commits in the last 6 months. Available on PyPI.

Use this if you are developing, fine-tuning, or evaluating a large language model and need to systematically measure its potential for generating harmful or stereotypical content.

Not ideal if you are a general user interested in checking the bias of a pre-trained, off-the-shelf LLM without needing to integrate it into a development workflow.

Topics: AI ethics, natural language processing, bias detection, model evaluation, responsible AI
Status: Stale (6 months)
Maintenance: 0 / 25
Adoption: 6 / 25
Maturity: 25 / 25
Community: 15 / 25


Stars: 21
Forks: 4
Language: Python
License: MIT
Last pushed: Apr 08, 2025
Commits (30d): 0
Dependencies: 3

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/MilaNLProc/honest"

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
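
The same data can be fetched programmatically; the sketch below assumes only the endpoint shown above, and since the response schema is not documented here it simply prints whatever JSON comes back.

# Sketch: fetch the quality data in Python (assumes the `requests` package).
import requests

URL = "https://pt-edge.onrender.com/api/v1/quality/embeddings/MilaNLProc/honest"

resp = requests.get(URL, timeout=10)
resp.raise_for_status()   # fail loudly on HTTP errors
print(resp.json())        # response fields are not documented here; inspect the output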