leap-laboratories/PIZZA

An attribution library for LLMs

/ 100

Emerging

This project helps anyone working with Large Language Models (LLMs) see which words or phrases in a prompt most influence the model's response. You provide a prompt and the output an LLM generated from it, and the library returns a breakdown of how much each input token contributed to that output. It suits AI product managers, researchers, and anyone debugging LLM behavior.
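To make the idea of per-token attribution concrete, here is a minimal sketch of one simple approach, leave-one-out ablation: remove each token in turn and record how much a score for the output changes. This is an illustration only, not PIZZA's API or necessarily its method, neither of which this listing describes. The names `leave_one_out_attribution` and `toy_score` are hypothetical, and `toy_score` is a hard-coded stand-in for what would in practice be a call to an LLM.

```python
from typing import Callable, List, Tuple


def leave_one_out_attribution(
    tokens: List[str],
    score_fn: Callable[[List[str]], float],
) -> List[Tuple[str, float]]:
    """Attribute a score to each token as the drop caused by removing it."""
    baseline = score_fn(tokens)
    attributions = []
    for i, token in enumerate(tokens):
        # Rebuild the prompt without token i and re-score it.
        ablated = tokens[:i] + tokens[i + 1:]
        attributions.append((token, baseline - score_fn(ablated)))
    return attributions


def toy_score(tokens: List[str]) -> float:
    """Hypothetical stand-in for an LLM call (assumption, not a real model).

    A real scorer might return the model's log-probability of the original
    output given the (possibly ablated) prompt.
    """
    weights = {"capital": 2.0, "France": 1.0}
    return sum(weights.get(t, 0.0) for t in tokens)


prompt_tokens = "What is the capital of France ?".split()
result = leave_one_out_attribution(prompt_tokens, toy_score)

for token, score in result:
    print(f"{token:>8}  {score:+.2f}")
```

With a real model, each ablation costs one extra model call, so the number of calls grows linearly with prompt length. Removing tokens one at a time also misses interactions between tokens, which is one reason attribution tools often use more elaborate perturbation schemes.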

No commits in the last 6 months.

Use this if you need to understand the 'why' behind an LLM's output by dissecting the impact of individual prompt elements.

Not ideal if you are looking for a tool to train LLMs or optimize their performance and have no need to interpret why they respond as they do.

LLM-explanation AI-interpretability prompt-engineering AI-debugging natural-language-processing

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 8 / 25

Maturity 16 / 25

Community 13 / 25


Stars

Forks

Language: Python

License: —

Higher-rated alternatives

MadryLab/context-cite

Attribute (or cite) statements generated by LLMs back to in-context information.

microsoft/augmented-interpretable-models

Interpretable and efficient predictors using pre-trained language models. Scikit-learn compatible.

Trustworthy-ML-Lab/CB-LLMs

[ICLR 25] A novel framework for building intrinsically interpretable LLMs with...

poloclub/LLM-Attributor

LLM Attributor: Attribute LLM's Generated Text to Training Data

THUDM/LongCite

LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA
