zhudotexe/fanoutqa
Companion code for FanOutQA: Multi-Hop, Multi-Document Question Answering for Large Language Models (ACL 2024)
This project provides the dataset and evaluation tools for assessing how well large language models (LLMs) answer complex questions that require gathering information from multiple Wikipedia articles. You load the benchmark questions, generate answers with your own model, and score them against human-written reference answers (see the usage sketch below). It's designed for researchers and practitioners who are developing and testing advanced question-answering systems.
No commits in the last 6 months. Available on PyPI.
Use this if you are developing or evaluating large language models and need a robust benchmark for multi-hop, multi-document question answering.
Not ideal if you're looking for an off-the-shelf solution to answer general questions using LLMs without needing to develop or evaluate models yourself.
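A minimal usage sketch in Python, assuming the fanoutqa PyPI package exposes a load_dev() loader and an eval.evaluate() scorer as its README describes; my_model() is a hypothetical stand-in for your own LLM, and exact names and signatures should be verified against the repo.

# Sketch: score a model's answers on the FanOutQA dev set.
# Assumes `pip install "fanoutqa[eval]"` and the loader/scorer names
# described in the repo README; verify against the repo before relying on this.
import fanoutqa
from fanoutqa.eval import evaluate

questions = fanoutqa.load_dev()  # list of dev-set question records

answers = []
for q in questions:
    model_answer = my_model(q.question)  # hypothetical: your LLM call here
    answers.append({"id": q.id, "answer": model_answer})

scores = evaluate(questions, answers)  # aggregate accuracy/ROUGE-style metrics
print(scores)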
Stars: 59
Forks: 5
Language: Python
License: MIT
Category:
Last pushed: Sep 22, 2025
Commits (30d): 0
Dependencies: 4
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/zhudotexe/fanoutqa"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
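For programmatic access, the same endpoint can be fetched from Python; a minimal standard-library sketch, assuming the endpoint returns JSON (the response schema is not documented on this page).

# Sketch: fetch this repo's quality data from the API shown above.
# The response schema is not documented here, so the JSON is printed as-is;
# add your API key per the service's docs if you register for higher limits.
import json
import urllib.request

URL = "https://pt-edge.onrender.com/api/v1/quality/transformers/zhudotexe/fanoutqa"

with urllib.request.urlopen(URL) as resp:
    data = json.load(resp)

print(json.dumps(data, indent=2))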
Higher-rated alternatives
ExtensityAI/symbolicai: A neurosymbolic perspective on LLMs
TIGER-AI-Lab/MMLU-Pro: The code and data for "MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding...
deep-symbolic-mathematics/LLM-SR: [ICLR 2025 Oral] This is the official repo for the paper "LLM-SR" on Scientific Equation...
microsoft/interwhen: A framework for verifiable reasoning with language models.
xlang-ai/Binder: [ICLR 2023] Code for the paper "Binding Language Models in Symbolic Languages"