zhudotexe/fanoutqa

Companion code for FanOutQA: Multi-Hop, Multi-Document Question Answering for Large Language Models (ACL 2024)

Score: 44 / 100 (Emerging)

This project provides a dataset and evaluation tools for assessing how well large language models (LLMs) answer complex questions that require gathering information from multiple Wikipedia articles. Given a model's answer to a benchmark question, the tools score it against human-written reference answers. It is designed for researchers and practitioners developing and testing advanced question-answering systems.
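The scoring step described above can be sketched generically. The snippet below is a hypothetical token-overlap (F1) scorer for illustration only; FanOutQA's actual evaluation harness may use different metrics and normalization.

```python
from collections import Counter

def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1 between a model answer and a human-written reference.

    Hypothetical metric for illustration; not FanOutQA's official scorer.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection: shared tokens, counted with multiplicity.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

print(token_f1("paris, france", "paris france"))  # → 0.5 (comma blocks one token match)
```

Whitespace tokenization keeps the sketch self-contained; a real harness would typically strip punctuation and articles before comparing.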

No commits in the last 6 months. Available on PyPI.

Use this if you are developing or evaluating large language models and need a robust benchmark for multi-hop, multi-document question answering.

Not ideal if you're looking for an off-the-shelf solution to answer general questions using LLMs without needing to develop or evaluate models yourself.

Tags: LLM evaluation, question answering, natural language processing, research, knowledge retrieval
Stale (6 months)
Maintenance: 2 / 25
Adoption: 8 / 25
Maturity: 25 / 25
Community: 9 / 25


Stars: 59
Forks: 5
Language: Python
License: MIT
Last pushed: Sep 22, 2025
Commits (30d): 0
Dependencies: 4

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/zhudotexe/fanoutqa"
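The same endpoint can be queried from Python. This is a minimal stdlib sketch assuming the endpoint returns a JSON body; the helper names and the response structure are assumptions, not documented API details.

```python
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(ecosystem: str, repo: str) -> str:
    """Build the quality-endpoint URL for a repository (path layout from the curl example)."""
    return f"{API_BASE}/{ecosystem}/{repo}"

def fetch_quality(ecosystem: str, repo: str) -> dict:
    """Fetch and decode the quality report. No key needed up to 100 requests/day."""
    with urllib.request.urlopen(quality_url(ecosystem, repo)) as resp:
        return json.load(resp)

if __name__ == "__main__":
    # Mirrors the curl example above; prints whatever JSON the service returns.
    report = fetch_quality("transformers", "zhudotexe/fanoutqa")
    print(json.dumps(report, indent=2))
```

The network call is kept under the `__main__` guard so the URL builder can be reused without triggering a request on import.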

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.