WolodjaZ/MSAE
Interpreting CLIP with Hierarchical Sparse Autoencoders (ICML 2025)
This project helps AI researchers and practitioners better understand how large vision-language models like CLIP interpret images and text. It takes pre-computed activations from these models and generates hierarchical, interpretable features that reveal the semantic concepts the model uses. This allows researchers to analyze model biases and perform concept-based similarity searches, ultimately leading to more controllable and explainable AI systems.
Use this if you need to extract and analyze understandable concepts from complex vision-language models to improve their interpretability and control.
Not ideal if you are primarily interested in general-purpose model training or fine-tuning without a specific focus on interpretability.
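The core idea described above — encoding pre-computed CLIP activations into sparse, interpretable features — can be sketched in a few lines. This is an illustrative toy, not the MSAE repository's actual API: the weight shapes, the top-k sparsity rule, and all names here are assumptions, and real encoder/decoder weights would be trained rather than random.

```python
# Hypothetical sketch: turning pre-computed CLIP activations into sparse codes.
# Shapes and names are illustrative, not the repo's actual interface.
import numpy as np

rng = np.random.default_rng(0)

d_model, d_feat, k = 512, 2048, 32   # CLIP width, dictionary size, active features

# Stand-in for pre-computed CLIP image/text activations (batch of 4).
activations = rng.standard_normal((4, d_model)).astype(np.float32)

# Randomly initialized encoder/decoder weights (in practice these are trained).
W_enc = rng.standard_normal((d_model, d_feat)).astype(np.float32) / np.sqrt(d_model)
b_enc = np.zeros(d_feat, dtype=np.float32)
W_dec = W_enc.T.copy()

def encode_topk(x, W, b, k):
    """ReLU-encode, then keep only the k largest features per sample."""
    z = np.maximum(x @ W + b, 0.0)
    # Zero out everything except the top-k activations in each row.
    idx = np.argsort(z, axis=1)[:, :-k]
    np.put_along_axis(z, idx, 0.0, axis=1)
    return z

codes = encode_topk(activations, W_enc, b_enc, k)  # sparse feature codes
recon = codes @ W_dec                              # approximate reconstruction

print(codes.shape)              # (4, 2048)
print((codes > 0).sum(axis=1))  # at most k nonzero features per sample
```

Each nonzero entry of `codes` would, in a trained model, correspond to a learned concept; inspecting which inputs activate a given feature is what enables the bias analysis and concept-based similarity search mentioned above.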
Stars: 22
Forks: 6
Language: Jupyter Notebook
License: MIT
Category: ML Frameworks
Last pushed: Jan 17, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/WolodjaZ/MSAE"
Open to everyone: 100 requests/day with no key required. A free key raises the limit to 1,000 requests/day.
Higher-rated alternatives
mlfoundations/open_clip
An open source implementation of CLIP.
noxdafox/clipspy
Python CFFI bindings for the 'C' Language Integrated Production System CLIPS
openai/CLIP
CLIP (Contrastive Language-Image Pretraining): predicts the most relevant text snippet given an image.
moein-shariatnia/OpenAI-CLIP
Simple implementation of OpenAI CLIP model in PyTorch.
BioMedIA-MBZUAI/FetalCLIP
Official repository of FetalCLIP: A Visual-Language Foundation Model for Fetal Ultrasound Image Analysis