megagonlabs/llm-longeval
💵 Code for "Less is More for Long Document Summary Evaluation by LLMs" (Wu*, Iso*, et al.; EACL 2024)
This tool helps researchers, content analysts, and anyone working with large volumes of text efficiently evaluate the quality of AI-generated summaries of long documents. It takes a long source document and its AI-generated summary as input, then produces metrics such as relevance, factual consistency, and faithfulness to assess how well the summary captures the original's essence. This is particularly useful for those who need to gauge the reliability and accuracy of automated summarization without incurring high costs.
No commits in the last 6 months.
Use this if you need to reliably and cost-effectively evaluate the quality of AI-generated summaries for very long reports, articles, or scientific papers.
Not ideal if you are evaluating summaries of short documents, or if you primarily need to compare summarization models using traditional reference-based metrics like ROUGE or BERTScore rather than human-like judgment.
Stars
11
Forks
—
Language
Python
License
BSD-3-Clause
Category
Last pushed
Feb 22, 2024
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/megagonlabs/llm-longeval"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
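The same endpoint can be queried from Python. A minimal sketch using only the standard library; the URL structure is taken from the curl example above, while the helper names and the assumption that the endpoint returns JSON are illustrative, not documented here:

```python
import json
import urllib.request

# Base URL from the curl example above; the path parameters appear to be
# a category ("transformers") followed by the repository owner and name.
BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the API URL for a repository (structure mirrors the curl example)."""
    return f"{BASE}/{category}/{owner}/{repo}"

def fetch_quality(category: str, owner: str, repo: str) -> dict:
    """Fetch the quality record; assumes the endpoint returns a JSON object."""
    with urllib.request.urlopen(quality_url(category, owner, repo)) as resp:
        return json.load(resp)

url = quality_url("transformers", "megagonlabs", "llm-longeval")
```

At the free tier, a client making repeated calls should stay under 100 requests per day (or 1,000 with a key), so caching responses locally is advisable.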