harvey-fin/absence-bench
Code implementation for the paper "AbsenceBench: Language Models Can't Tell What's Missing"
This project helps AI researchers and developers evaluate how well large language models (LLMs) can detect missing information in long texts. It takes a pre-processed dataset of texts with intentionally omitted details (such as poetry or GitHub pull requests) and uses it to test various LLMs via API calls. The output reports how accurately each LLM identifies content that is genuinely absent, rather than merely off-topic.
Use this if you are a researcher or AI developer who needs a standardized benchmark to assess and compare different LLMs' ability to notice genuinely missing information in lengthy inputs.
Not ideal if you're looking for a tool to find missing data in your own unstructured text for a business application, as this is a research-focused evaluation benchmark.
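To make the task concrete, here is a minimal sketch of the kind of evaluation loop such a benchmark runs: redact lines from a document, ask a model what was removed, and score recall. This is an illustrative assumption, not the repository's actual code; query_llm is a hypothetical stand-in for whatever provider API you test against.

import random

def make_instance(lines: list[str], k: int = 2) -> tuple[str, set[str]]:
    # Remove k random lines from a document; return the redacted text
    # and the set of omitted lines (the gold answer).
    omitted = set(random.sample(lines, k))
    redacted = "\n".join(line for line in lines if line not in omitted)
    return redacted, omitted

def score(predicted: set[str], gold: set[str]) -> float:
    # Recall: fraction of the truly omitted lines the model identified.
    return len(predicted & gold) / len(gold) if gold else 1.0

def query_llm(original: str, redacted: str) -> set[str]:
    # Hypothetical stand-in: call the model under test with both versions
    # and parse its answer into the set of lines it claims were removed.
    raise NotImplementedError("plug in your provider's client here")

poem = ["Line one", "Line two", "Line three", "Line four"]
redacted, gold = make_instance(poem)
# predicted = query_llm("\n".join(poem), redacted)
# print("recall:", score(predicted, gold))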
Stars
18
Forks
—
Language
Python
License
—
Category
NLP
Last pushed
Oct 23, 2025
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/harvey-fin/absence-bench"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
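The same endpoint can be queried from a script. Below is a minimal Python sketch using the requests library; the response schema isn't documented here, so the payload is printed rather than parsed into assumed fields.

import requests

# Quality-data endpoint for this repository (same URL as the curl example).
URL = "https://pt-edge.onrender.com/api/v1/quality/nlp/harvey-fin/absence-bench"

resp = requests.get(URL, timeout=30)
resp.raise_for_status()  # fail loudly on rate limiting or server errors
print(resp.json())       # inspect the payload before relying on field names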
Higher-rated alternatives
google/langfun
OO for LLMs
tanaos/artifex
Small Language Model Inference, Fine-Tuning and Observability. No GPU, no labeled data needed.
preligens-lab/textnoisr
Adding random noise to a text dataset while precisely controlling the quality of the result
vulnerability-lookup/VulnTrain
A tool to generate datasets and models based on vulnerabilities descriptions from @Vulnerability-Lookup.
masakhane-io/masakhane-mt
Machine Translation for Africa