xmed-lab/TAM

[ICCV25 Oral] Token Activation Map to Visually Explain Multimodal LLMs

Quality score: 31 / 100 (Emerging)

When analyzing what a multimodal AI model sees in an image or video, this tool helps you understand precisely why it generated certain words. It takes your image or video and the model's text output, then shows you exactly which parts of the visual input "activated" each word. This is useful for AI researchers or anyone needing to debug or interpret multimodal AI models.
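The core idea of a token activation map can be sketched as a per-token heatmap over image patches. The snippet below is a minimal illustration, not the repo's actual API: the function name, the 16×16 patch grid, and the assumption that each generated token comes with a flat vector of activations over visual tokens are all hypothetical.

```python
import numpy as np

def token_activation_map(activations, grid=(16, 16), image_size=(224, 224)):
    """Turn one token's activations over visual patches into a pixel heatmap.

    activations: flat array of length grid[0] * grid[1] (hypothetical input
    shape; the real tool derives these from the model's internals).
    Returns an image_size heatmap normalized to [0, 1].
    """
    h, w = grid
    amap = np.asarray(activations, dtype=np.float64).reshape(h, w)
    # Nearest-neighbour upsample each patch to pixel resolution.
    sy, sx = image_size[0] // h, image_size[1] // w
    amap = np.repeat(np.repeat(amap, sy, axis=0), sx, axis=1)
    # Min-max normalize so the strongest activation maps to 1.
    amap -= amap.min()
    if amap.max() > 0:
        amap /= amap.max()
    return amap

# Synthetic example: activations peaking on a single patch.
acts = np.zeros(256)
acts[37] = 1.0
heat = token_activation_map(acts)
print(heat.shape, heat.max())  # (224, 224) 1.0
```

Overlaying such a heatmap on the input image (e.g. with an alpha-blended colormap) is what lets you see which regions "activated" a given word.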


Use this if you need to visualize and explain the exact visual evidence a multimodal large language model used to generate specific words or phrases.

Not ideal if you're looking for a tool to explain text-only AI models or if you don't need to understand the fine-grained visual reasoning behind multimodal AI outputs.

Tags: AI-explanation, model-debugging, multimodal-AI, AI-interpretability, computer-vision
No License · No Package · No Dependents
Maintenance: 6 / 25
Adoption: 10 / 25
Maturity: 7 / 25
Community: 8 / 25


Stars: 180
Forks: 7
Language: Python
License: None
Last pushed: Dec 14, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/xmed-lab/TAM"

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.