itsqyh/Awesome-LMMs-Mechanistic-Interpretability
A curated collection of resources on the Mechanistic Interpretability (MI) of Large Multimodal Models (LMMs). The repository aggregates surveys, blog posts, and research papers that explore how LMMs represent, transform, and align multimodal information internally.
It is aimed at AI researchers and practitioners studying how these models process and link different modalities, such as images and text, and at anyone focused on model transparency and trustworthiness.
Use this if you are researching or developing Large Multimodal Models and need to explore how they make decisions or represent information internally.
Not ideal if you are looking for ready-to-use code or tools to apply LMMs for practical tasks without needing to understand their internal mechanics.
Stars: 192
Forks: 5
Language: —
License: —
Category: —
Last pushed: Mar 04, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/itsqyh/Awesome-LMMs-Mechanistic-Interpretability"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
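If you prefer programmatic access over curl, here is a minimal Python sketch using only the standard library. It assumes the endpoint returns JSON; the exact response fields are not documented here, so the example simply prints whatever comes back.

# Minimal sketch for querying the quality API, assuming a JSON response.
# The response schema is not documented on this page, so we just print it.
import json
import urllib.request

URL = ("https://pt-edge.onrender.com/api/v1/quality/"
       "transformers/itsqyh/Awesome-LMMs-Mechanistic-Interpretability")

with urllib.request.urlopen(URL) as resp:  # no key needed up to 100 requests/day
    data = json.load(resp)

print(json.dumps(data, indent=2))  # inspect the fields the API actually returns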
Higher-rated alternatives
MadryLab/context-cite
Attribute (or cite) statements generated by LLMs back to in-context information.
microsoft/augmented-interpretable-models
Interpretable and efficient predictors using pre-trained language models. Scikit-learn compatible.
Trustworthy-ML-Lab/CB-LLMs
[ICLR 25] A novel framework for building intrinsically interpretable LLMs with...
poloclub/LLM-Attributor
LLM Attributor: Attribute LLM's Generated Text to Training Data
THUDM/LongCite
LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA