salesforce/summary-of-a-haystack

Codebase accompanying the Summary of a Haystack paper.

Score: 33 / 100 (Emerging)

This project helps researchers and developers evaluate how well large language models (LLMs) and Retrieval Augmented Generation (RAG) systems can summarize very long documents or conversations. You input large text documents (like news articles or conversation transcripts) and get back automatically generated summaries alongside evaluation scores. It's designed for AI researchers and machine learning engineers who need to benchmark and compare the performance of different summarization methods.

No commits in the last 6 months.

Use this if you are developing or comparing long-context LLMs and RAG systems and need a standardized way to measure their summarization capabilities on complex, lengthy texts.

Not ideal if you are looking for an out-of-the-box summarization tool for general use, as this project focuses on research and evaluation of underlying models.

Tags: AI-research, LLM-benchmarking, NLP-evaluation, generative-AI, information-retrieval
Badges: Stale (6 months), No Package, No Dependents
Maintenance: 0 / 25
Adoption: 9 / 25
Maturity: 16 / 25
Community: 8 / 25


Stars: 80
Forks: 5
Language: Jupyter Notebook
License: Apache-2.0
Category: rag-qa-systems
Last pushed: Sep 20, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/salesforce/summary-of-a-haystack"

Open to everyone: 100 requests/day with no key needed. Get a free key to raise the limit to 1,000 requests/day.
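The same request can be made from Python with the standard library. This is a minimal sketch: only the endpoint URL above is taken from this page; the shape of the JSON response and any API-key header name are assumptions, so only URL construction is shown as certain.

```python
import json
import urllib.request

# Base endpoint taken from the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/quality/embeddings"


def quality_url(owner: str, repo: str) -> str:
    """Build the quality-data URL for a given GitHub repository."""
    return f"{BASE}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch and decode the quality data (response schema is undocumented here)."""
    # Works keyless within the 100 requests/day limit; how a key is
    # passed (header vs. query parameter) is not documented on this page.
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)


print(quality_url("salesforce", "summary-of-a-haystack"))
```

`fetch_quality` is defined but not called here, so the sketch runs without touching the network.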