LLM Interpretability & Explainability LLM Tools

Tools and frameworks for understanding, explaining, and visualizing how large language models make decisions through mechanistic analysis, post-hoc explanations, concept-based interpretability, and neuron-level attribution methods. Does NOT include general model evaluation, bias detection, hallucination mitigation, or knowledge editing.

There are 30 LLM interpretability & explainability tools tracked. The highest-rated is filipnaudot/llmSHAP at 49/100 with 16 stars.

Get all 30 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=llm-tools&subcategory=llm-interpretability-explainability&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
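The same request can be sketched in Python using only the standard library. The endpoint and query parameters are copied from the curl command above; the helper names `build_url` and `fetch_projects` are hypothetical, and the shape of the returned JSON is not documented here, so the sketch simply parses whatever comes back. The `limit` value of 20 mirrors the curl example; whether a larger value returns all 30 projects in one call is an assumption that would need checking against the API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example; everything else is illustrative.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain, subcategory, limit):
    """Assemble the query URL from the three parameters the curl call uses."""
    query = urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """Issue a GET request and parse the body as JSON.

    The response schema is an unknown here, so the parsed object is
    returned as-is without assuming any field names.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_projects(build_url("llm-tools", "llm-interpretability-explainability", 20))` would issue the same keyless request as the curl command, so it counts against the 100 requests/day allowance.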

| # | Tool | Description | Score | Tier |
|---|------|-------------|-------|------|
| 1 | filipnaudot/llmSHAP | llmSHAP: a multi-threaded explainability framework using Shapley values for... | 49 | Emerging |
| 2 | microsoft/automated-brain-explanations | Generating and validating natural-language explanations for the brain. | 48 | Emerging |
| 3 | CAS-SIAT-XinHai/CPsyCoun | [ACL 2024] CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and... | 44 | Emerging |
| 4 | wesg52/universal-neurons | Universal Neurons in GPT2 Language Models | 39 | Emerging |
| 5 | ICTMCG/LLM-for-misinformation-research | Paper list of misinformation research using (multi-modal) large language... | 38 | Emerging |
| 6 | marcusm117/IdentityChain | [ICLR 2024] Beyond Accuracy: Evaluating Self-Consistency of Code Large... | 36 | Emerging |
| 7 | shahriargolchin/DCQ | The official repository for the paper entitled "Data Contamination Quiz: A... | 35 | Emerging |
| 8 | Wang-ML-Lab/interpretable-foundation-models | [ICML 2024] Probabilistic Conceptual Explainers (PACE): Trustworthy... | 31 | Emerging |
| 9 | OpenMOSS/Say-I-Dont-Know | [ICML'2024] Can AI Assistants Know What They Don't Know? | 30 | Emerging |
| 10 | amazon-science/ContraCLM | [ACL 2023] Code for ContraCLM: Contrastive Learning For Causal Language Model | 29 | Experimental |
| 11 | OSU-NLP-Group/AttrScore | Code, datasets, models for the paper "Automatic Evaluation of Attribution by... | 29 | Experimental |
| 12 | MozerWang/DEMO | [ACL 2025 (Findings)] DEMO: Reframing Dialogue Interaction with Fine-grained... | 26 | Experimental |
| 13 | YuweiYin/SWI | SWI: Speaking with Intent in Large Language Models | 26 | Experimental |
| 14 | stefdesabbata/geospatial-mechanistic-interpretability | Geospatial Mechanistic Interpretability of Large Language Models | 23 | Experimental |
| 15 | 12kimih/HiCUPID | [ACL 2025] Exploring the Potential of LLMs as Personalized Assistants:... | 22 | Experimental |
| 16 | Joe-b-20/CoreVital | Mechanistic interpretability toolkit for monitoring LLM internal health.... | 22 | Experimental |
| 17 | DataScienceUIBK/llm-reranking-generalization-study | How Good are LLM-based Rerankers? Accepted at EMNLP Findings 2025 | 22 | Experimental |
| 18 | AColonnaDistria/llm2sql-consistency-analysis | LLM-to-SQL analysis tool designed to quantify non-determinism behavior of... | 21 | Experimental |
| 19 | jiangjiechen/uncommongen | Resources for our ACL 2023 paper: "Say What You Mean! Large Language Models... | 21 | Experimental |
| 20 | Nearzero-S/Intuitive-MechInterp | Helping Humans Understand Our Processing | 21 | Experimental |
| 21 | youzhaozhao/LLM-Heuristic-Graph-Coloring | Exploring LLM-assisted design of graph coloring heuristics through ... | 19 | Experimental |
| 22 | DAMO-NLP-SG/LLM-argumentation | [ACL2024] Exploring the Potential of Large Language Models in Computational... | 19 | Experimental |
| 23 | GovAIx/QualityModulation | [Nature Communications] Linguistic features of AI mis/disinformation and the... | 17 | Experimental |
| 24 | armlynobinguar/LLM-XAI-Papers | A curated collection of research papers on explainability and... | 17 | Experimental |
| 25 | emanuelemessina/broken-morals | Moral copilot for high-stakes ethical decisions in business contexts | 15 | Experimental |
| 26 | 3B-Group/ConvRe | 🤖ConvRe🤯: An Investigation of LLMs’ Inefficacy in Understanding Converse... | 14 | Experimental |
| 27 | phvv-me/icip2025 | Official Repository for Vision Language Model Interpretability with Concept... | 13 | Experimental |
| 28 | pvicinanza/llm_prompt_tuning_conspiracies | This repository provides the data and code needed to replicate "Semantic... | 11 | Experimental |
| 29 | ChuanMeng/SIP | Code for the CIKM 2023 long paper: System Initiative Prediction for... | 11 | Experimental |
| 30 | vbainwala/Benchmarking-LLMs-Indic-Languages | Benchmarking Study of Bloomz-560m, mBART-large, IndicBART on the Indic Languages | 11 | Experimental |