catherinesyeh/attention-viz
Visualizing query-key interactions in language + vision transformers (VIS 2023)
This tool helps machine learning researchers understand how transformer models process information. It takes an existing language or vision transformer and its outputs, then generates interactive visualizations of how different parts of the input (such as words in a sentence or regions in an image) relate to each other inside the model. It is aimed at researchers who build or analyze transformer models and need deeper insight into their internal mechanisms.
162 stars. No commits in the last 6 months.
Use this if you are a machine learning researcher working with transformer models and need to visualize and interpret their internal 'attention' mechanisms to improve model understanding and debug performance.
Not ideal if you are looking for a tool to train models, visualize standard performance metrics, or analyze models other than transformers.
Stars
162
Forks
22
Language
HTML
License
MIT
Category
Last pushed
May 05, 2024
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/catherinesyeh/attention-viz"
Open to everyone: 100 requests/day with no key. A free key raises the limit to 1,000 requests/day.
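Equivalently, the endpoint can be called from Python instead of curl. A minimal sketch using only the standard library, assuming the endpoint returns JSON; the `Authorization: Bearer` header name for the optional API key is an assumption, not documented on this page:

```python
import json
import urllib.request

BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"

def build_request(owner: str, repo: str, api_key: str = None) -> urllib.request.Request:
    """Build a GET request for a repo's quality data."""
    url = f"{BASE}/{owner}/{repo}"
    headers = {"Accept": "application/json"}
    if api_key:
        # Assumed header scheme; check the API docs for the real one.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(url, headers=headers)

def fetch_quality(owner: str, repo: str, api_key: str = None) -> dict:
    """Fetch and decode the JSON response (counts against the daily quota)."""
    with urllib.request.urlopen(build_request(owner, repo, api_key)) as resp:
        return json.load(resp)
```

For example, `fetch_quality("catherinesyeh", "attention-viz")` hits the same URL as the curl command above.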
Higher-rated alternatives
Lightning-AI/litgpt
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
liangyuwang/Tiny-DeepSpeed
Tiny-DeepSpeed, a minimalistic re-implementation of the DeepSpeed library
microsoft/Text2Grad
🚀 Text2Grad: Converting natural language feedback into gradient signals for precise model...
huangjia2019/llm-gpt
From classic NLP to modern LLMs: building language models step by step. Epubit book: "GPT Illustrated: How Large Models Are Built" ...
FareedKhan-dev/Building-llama3-from-scratch
LLaMA 3 is one of the most promising open-source models after Mistral; we will recreate its...