harvey-fin/absence-bench

Code implementation for the paper "AbsenceBench: Language Models Can't Tell What's Missing"

Quality score: 19 / 100 (Experimental)

This project helps AI researchers and developers evaluate how well large language models (LLMs) detect missing information in long texts. It takes a pre-processed dataset of documents with intentionally omitted details (such as poetry or GitHub pull requests) and uses it to test various LLMs via API calls. The output shows how accurately each LLM identifies content that is genuinely absent, as opposed to content that is merely off-topic.
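As a rough illustration of that evaluation flow, the sketch below prompts a model with the original and redacted text, then scores recall over the known omissions. This is not the repo's actual code: the prompt wording, model name, and substring-based recall metric are all assumptions.

# Illustrative sketch only; the repo's real prompts and scoring may differ.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def absence_recall(original: str, redacted: str, omitted: list[str],
                   model: str = "gpt-4o-mini") -> float:
    """Ask a model which lines were removed, then compute recall over
    the known omissions. Substring matching is a simplification."""
    prompt = (
        "Below are two versions of a document. Some lines were removed "
        "from the second version. List exactly the removed lines.\n\n"
        f"ORIGINAL:\n{original}\n\nMODIFIED:\n{redacted}"
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    answer = response.choices[0].message.content or ""
    found = sum(1 for line in omitted if line in answer)
    return found / len(omitted) if omitted else 1.0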

Use this if you are a researcher or AI developer who needs a standardized benchmark to assess and compare different LLMs' ability to notice genuinely missing information in lengthy inputs.

Not ideal if you're looking for a tool to find missing data in your own unstructured text for a business application, as this is a research-focused evaluation benchmark.

Tags: AI-research, LLM-evaluation, natural-language-processing, model-benchmarking, text-analysis
No license · No package · No dependents
Maintenance: 6 / 25
Adoption: 6 / 25
Maturity: 7 / 25
Community: 0 / 25

Stars: 18
Forks: n/a
Language: Python
License: None
Last pushed: Oct 23, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/harvey-fin/absence-bench"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
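For programmatic access, a minimal Python equivalent of the curl call above might look like the following. The JSON response shape is an assumption; the endpoint URL is taken from the example above.

# Minimal sketch of the same request in Python.
import requests

url = "https://pt-edge.onrender.com/api/v1/quality/nlp/harvey-fin/absence-bench"
resp = requests.get(url, timeout=10)
resp.raise_for_status()
data = resp.json()  # assumes the endpoint returns JSON quality data
print(data)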