salesforce/summary-of-a-haystack
Codebase accompanying the Summary of a Haystack paper.
This project helps researchers and developers evaluate how well large language models (LLMs) and retrieval-augmented generation (RAG) systems summarize very long documents or conversations. You supply large text documents (such as news articles or conversation transcripts) and get back automatically generated summaries along with evaluation scores. It is designed for AI researchers and machine learning engineers who need to benchmark and compare summarization methods.
No commits in the last 6 months.
Use this if you are developing or comparing long-context LLMs and RAG systems and need a standardized way to measure their summarization capabilities on complex, lengthy texts.
Not ideal if you are looking for an out-of-the-box summarization tool for general use, as this project focuses on research and evaluation of underlying models.
Stars: 80
Forks: 5
Language: Jupyter Notebook
License: Apache-2.0
Category:
Last pushed: Sep 20, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/salesforce/summary-of-a-haystack"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
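The same request can be made from Python, which may be convenient when working in notebooks. The following is a minimal sketch using only the standard library. The endpoint path is copied from the curl command above; the helper names `build_url` and `fetch_quality` are hypothetical, and the assumption that the endpoint returns JSON is not confirmed by this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/embeddings"


def build_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    # Percent-encode each path segment so unusual names cannot break the URL.
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for a repository (hypothetical helper).

    Assumption: the response body is JSON. If it is not, json.loads
    raises a ValueError and the raw body should be inspected instead.
    """
    request = urllib.request.Request(
        build_url(owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request and counts against the daily limit):
# data = fetch_quality("salesforce", "summary-of-a-haystack")
```

Because unauthenticated access is capped at 100 requests/day, it is worth caching the result locally rather than calling `fetch_quality` repeatedly.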