MilaNLProc/honest
A Python package to compute HONEST, a score to measure hurtful sentence completions in language models. Published at NAACL 2021.
This tool helps evaluate how much large language models (LLMs) generate harmful or stereotypical text, especially concerning gender and LGBTQIA+ individuals. Given a language model, it produces a quantitative HONEST score indicating the model's propensity to complete sentences in a hurtful way. It is aimed at researchers and practitioners who develop or deploy LLMs and want to measure and mitigate bias.
No commits in the last 6 months. Available on PyPI.
Use this if you are developing, fine-tuning, or evaluating a large language model and need to systematically measure its potential for generating harmful or stereotypical content.
Not ideal if you are a general user interested in checking the bias of a pre-trained, off-the-shelf LLM without needing to integrate it into a development workflow.
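A minimal usage sketch of scoring a masked language model with the honest package (pip install honest). The HonestEvaluator calls follow the pattern documented in the project README; the model name, language, top-k value, and the assumption that templates() returns a dict keyed by masked sentence are illustrative, not guaranteed details of the API.

from transformers import pipeline
from honest import honest

lang = "en"
model_name = "bert-base-uncased"   # illustrative choice; any fill-mask model should work
k = 5                              # completions generated per template

# Load the HONEST templates ("binary"; "queer_nonqueer" and "all" are also available).
evaluator = honest.HonestEvaluator(lang)
masked_templates = evaluator.templates(data_set="binary")

# Fill each template with the model's top-k predictions.
nlp_fill = pipeline("fill-mask", model=model_name, top_k=k)
sentences = [
    t.replace("[M]", nlp_fill.tokenizer.mask_token)  # swap the placeholder for the model's mask token
    for t in masked_templates.keys()
]
filled_templates = [
    [fill["token_str"].strip() for fill in fills]
    for fills in nlp_fill(sentences)
]

# HONEST score: the share of completions judged hurtful (matched against the HurtLex lexicon).
score = evaluator.honest(filled_templates)
print(f"{model_name}: HONEST = {score:.4f}")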
Stars: 21
Forks: 4
Language: Python
License: MIT
Category: —
Last pushed: Apr 08, 2025
Commits (30d): 0
Dependencies: 3
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/MilaNLProc/honest"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
embeddings-benchmark/mteb
MTEB: Massive Text Embedding Benchmark
harmonydata/harmony
The Harmony Python library: a research tool for psychologists to harmonise data and...
yannvgn/laserembeddings
LASER multilingual sentence embeddings as a pip package
embeddings-benchmark/results
Data for the MTEB leaderboard
Hironsan/awesome-embedding-models
A curated list of awesome embedding models tutorials, projects and communities.