poppingtonic/transformer-visualization
Mechanistic interpretability tutorials, results, and a research log, built up while learning from publicly available research and experimentation. An evolving, open-ended project with slow updates; much of the work is incomplete.
This project helps AI researchers and students understand how Transformer-based language models make decisions. It takes a trained Transformer model and lets you visualize the internal processing of individual tokens, revealing the 'why' behind its outputs. It also provides pre-generated datasets of specific sentence structures (such as Indirect Object Identification) for focused interpretability studies.
No commits in the last 6 months.
Use this if you are a machine learning researcher or student focused on understanding the internal mechanisms of Transformer models, specifically for tasks like token processing and identifying 'induction heads'.
Not ideal if you are looking for a general-purpose model explanation tool for non-Transformer models or production-ready explainable AI (XAI) solutions for end-users.
Stars: 9
Forks: 3
Language: Jupyter Notebook
License: —
Last pushed: Apr 19, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/poppingtonic/transformer-visualization"
Open to everyone: 100 requests/day with no key required. A free key raises the limit to 1,000 requests/day.
Higher-rated alternatives
jessevig/bertviz
BertViz: Visualize Attention in Transformer Models
inseq-team/inseq
Interpretability for sequence generation models 🐛 🔍
EleutherAI/knowledge-neurons
A library for finding knowledge neurons in pretrained transformer models.
hila-chefer/Transformer-MM-Explainability
[ICCV 2021- Oral] Official PyTorch implementation for Generic Attention-model Explainability for...
cdpierse/transformers-interpret
Model explainability that works seamlessly with 🤗 transformers. Explain your transformers model...