harvey-fin/absence-bench

Code implementation for the paper "AbsenceBench: Language Models Can't Tell What's Missing"

Quality score: 19 / 100 (Experimental)

This project helps AI researchers and developers evaluate how well large language models (LLMs) detect missing information in long texts. It takes a pre-processed dataset of documents with intentionally omitted details (such as poetry or GitHub pull requests) and uses it to test various LLMs via API calls. The output shows how accurately each LLM identifies content that is genuinely absent, as opposed to content that is merely off-topic.
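As a rough illustration of that evaluation flow, the sketch below prompts a model with the original and redacted text, then scores recall over the known omissions. This is not the repo's actual code: the prompt wording, model name, and substring-based recall metric are all assumptions.

# Illustrative sketch only; the repo's real prompts and scoring may differ.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def absence_recall(original: str, redacted: str, omitted: list[str],
                   model: str = "gpt-4o-mini") -> float:
    """Ask a model which lines were removed, then compute recall over
    the known omissions. Substring matching is a simplification."""
    prompt = (
        "Below are two versions of a document. Some lines were removed "
        "from the second version. List exactly the removed lines.\n\n"
        f"ORIGINAL:\n{original}\n\nMODIFIED:\n{redacted}"
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    answer = response.choices[0].message.content or ""
    found = sum(1 for line in omitted if line in answer)
    return found / len(omitted) if omitted else 1.0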

Use this if you are a researcher or AI developer who needs a standardized benchmark to assess and compare different LLMs' ability to notice genuinely missing information in lengthy inputs.

Not ideal if you're looking for a tool to find missing data in your own unstructured text for a business application, as this is a research-focused evaluation benchmark.

Tags: AI-research, LLM-evaluation, natural-language-processing, model-benchmarking, text-analysis
No license · No package · No dependents
Maintenance: 6 / 25
Adoption: 6 / 25
Maturity: 7 / 25
Community: 0 / 25

Stars: 18
Forks: n/a
Language: Python
License: None
Last pushed: Oct 23, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/harvey-fin/absence-bench"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
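For programmatic access, a minimal Python equivalent of the curl call above might look like the following. The JSON response shape is an assumption; the endpoint URL is taken from the example above.

# Minimal sketch of the same request in Python.
import requests

url = "https://pt-edge.onrender.com/api/v1/quality/nlp/harvey-fin/absence-bench"
resp = requests.get(url, timeout=10)
resp.raise_for_status()
data = resp.json()  # assumes the endpoint returns JSON quality data
print(data)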