megagonlabs/llm-longeval

💵 Code for Less is More for Long Document Summary Evaluation by LLMs (Wu*, Iso*, et al.; EACL 2024)

Quality score: 21 / 100 (Experimental)

This tool helps researchers, content analysts, and anyone working with large volumes of text evaluate the quality of AI-generated summaries of long documents efficiently. It takes a long source document and its AI-generated summary as input, then produces metrics such as relevance, factual consistency, and faithfulness to assess how well the summary captures the original's essence. This is particularly useful for gauging the reliability and accuracy of automated summarization without incurring high evaluation costs.

No commits in the last 6 months.

Use this if you need to evaluate the quality of AI-generated summaries of very long reports, articles, or scientific papers reliably and cost-effectively.

Not ideal if you are evaluating summaries of short documents or if you primarily need to compare summarization models using traditional metrics like ROUGE or BERTScore without human-like judgment.

document-summarization content-evaluation natural-language-processing research-analysis AI-model-assessment
Flags: Stale (6 months) · No package published · No dependents
Maintenance: 0 / 25
Adoption: 5 / 25
Maturity: 16 / 25
Community: 0 / 25


Stars: 11
Forks:
Language: Python
License: BSD-3-Clause
Last pushed: Feb 22, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/megagonlabs/llm-longeval"

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
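The same endpoint can be called from Python with the standard library. This is a minimal sketch: the URL pattern follows the curl example above, but the structure of the JSON response (field names for the score and its breakdown) is not documented here, so the code simply returns the decoded payload rather than assuming a schema.

```python
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(platform: str, repo: str) -> str:
    # Build the endpoint URL; "transformers" is the platform segment
    # used in the curl example above.
    return f"{API_BASE}/{platform}/{repo}"


def fetch_quality(platform: str, repo: str) -> dict:
    # Fetch and decode the JSON payload for a repository.
    # The response schema is an assumption; inspect the dict to see
    # which fields the API actually returns.
    with urllib.request.urlopen(quality_url(platform, repo)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    data = fetch_quality("transformers", "megagonlabs/llm-longeval")
    print(data)
```

Anonymous callers share the 100 requests/day limit, so cache responses locally if you are polling many repositories.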