LeonEricsson/llmcontext
💢 Pressure testing the context window of open LLMs
This project helps developers and researchers measure how well open-source large language models (LLMs) can find a specific piece of information hidden inside a very long text (a "needle-in-a-haystack" test). You supply an LLM, a long text with a "needle" fact embedded in it, and a question; the output is a score indicating how accurately the LLM retrieved the needle, plus visualizations of performance across different text lengths and needle positions. It is aimed at anyone choosing or working with open-source LLMs for tasks that require long-context understanding.
No commits in the last 6 months.
Use this if you need to evaluate the long-context retrieval capabilities of open-source LLMs before deploying them for information extraction or question-answering on lengthy documents.
Not ideal if you are looking for a general-purpose LLM evaluation tool or if your primary concern is text generation quality rather than precise information retrieval from long contexts.
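The evaluation described above can be sketched in a few lines of Python. This is an illustrative outline of the needle-in-a-haystack procedure, not the repo's actual code: the function names, the character-based length control, and the token-overlap scoring are all assumptions for the sketch.

```python
# Hypothetical sketch of a needle-in-a-haystack evaluation; names and
# scoring are illustrative, not this repository's API.

def build_haystack(filler: str, needle: str, depth: float, target_chars: int) -> str:
    """Insert `needle` at a relative `depth` (0.0 = start, 1.0 = end)
    inside filler text repeated/trimmed to roughly `target_chars` characters."""
    body = (filler * (target_chars // len(filler) + 1))[:target_chars]
    pos = int(len(body) * depth)
    return body[:pos] + " " + needle + " " + body[pos:]

def score_retrieval(answer: str, expected: str) -> float:
    """Crude score: fraction of expected tokens that appear in the answer."""
    expected_tokens = expected.lower().split()
    hits = sum(tok in answer.lower() for tok in expected_tokens)
    return hits / len(expected_tokens)

# A full run would sweep context lengths and needle depths, querying the
# model once per cell of the grid, e.g.:
#
# for length in (1_000, 8_000, 32_000):
#     for depth in (0.0, 0.25, 0.5, 0.75, 1.0):
#         prompt = build_haystack(filler, needle, depth, length) + "\n" + question
#         score = score_retrieval(model(prompt), expected_answer)
```

The resulting grid of scores over (context length, needle depth) is what the project's visualizations summarize.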
Stars: 25
Forks: 1
Language: Jupyter Notebook
License: MIT
Category:
Last pushed: Aug 25, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/LeonEricsson/llmcontext"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000/day.
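The curl call above can also be made from Python with only the standard library. The response schema is not documented here, so this sketch just parses and prints whatever JSON comes back; the helper names are illustrative.

```python
# Minimal sketch of calling the quality API shown above. Only the endpoint
# URL comes from this page; the response schema is not documented here.
import json
import urllib.request

BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"

def quality_url(owner: str, repo: str) -> str:
    """Build the per-repository endpoint URL."""
    return f"{BASE}/{owner}/{repo}"

def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch and decode the JSON payload for one repository."""
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)

# Example (requires network access):
# print(json.dumps(fetch_quality("LeonEricsson", "llmcontext"), indent=2))
```

An API key, once obtained, would presumably be passed per the service's docs; that detail is not shown on this page.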
Higher-rated alternatives
PacktPublishing/Mastering-NLP-from-Foundations-to-LLMs
Mastering NLP from Foundations to LLMs, Published by Packt
HandsOnLLM/Hands-On-Large-Language-Models
Official code repo for the O'Reilly Book - "Hands-On Large Language Models"
mlabonne/llm-course
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
louisfb01/start-llms
A complete guide to start and improve your LLM skills in 2026 with little background in the...
Denis2054/Transformers-for-NLP-and-Computer-Vision-3rd-Edition
Transformers 3rd Edition