wesg52/universal-neurons
Universal Neurons in GPT2 Language Models
This project helps researchers understand the inner workings of language models like GPT-2 by providing tools to analyze individual 'neurons'. It takes precomputed activation and weight data from these models as input and generates summary statistics about neuron behavior and about neurons' connections to each other, to attention heads, and to the vocabulary. The primary users are researchers studying interpretability and the mechanistic understanding of neural networks.
No commits in the last 6 months.
Use this if you are a machine learning researcher aiming to explore and analyze the functions of individual neurons within GPT-2 language models to understand their contributions to the model's overall behavior.
Not ideal if you are looking to train new language models, fine-tune existing ones for specific tasks, or generate text directly, as this tool focuses on analyzing model internals rather than building applications.
Stars: 30
Forks: 7
Language: Jupyter Notebook
License: MIT
Category:
Last pushed: May 28, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/wesg52/universal-neurons"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
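For scripted access, the curl call above can be reproduced in Python. This is a minimal sketch, assuming the endpoint returns a JSON body; the actual response schema is not documented here, and the helper names are hypothetical.

```python
import json
import urllib.request

# Assumed base path, taken from the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def repo_quality_url(owner: str, name: str) -> str:
    """Build the API URL for a given GitHub repository."""
    return f"{BASE}/{owner}/{name}"


def fetch_quality(owner: str, name: str) -> dict:
    """Fetch the quality record for a repo (assumes a JSON response)."""
    with urllib.request.urlopen(repo_quality_url(owner, name)) as resp:
        return json.load(resp)


url = repo_quality_url("wesg52", "universal-neurons")
```

Without an API key this is limited to 100 requests per day, so cache responses locally if you are polling many repositories.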
Higher-rated alternatives
filipnaudot/llmSHAP
llmSHAP: a multi-threaded explainability framework using Shapley values for LLM-based outputs.
microsoft/automated-brain-explanations
Generating and validating natural-language explanations for the brain.
CAS-SIAT-XinHai/CPsyCoun
[ACL 2024] CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework...
ICTMCG/LLM-for-misinformation-research
Paper list of misinformation research using (multi-modal) large language models, i.e., (M)LLMs.
marcusm117/IdentityChain
[ICLR 2024] Beyond Accuracy: Evaluating Self-Consistency of Code Large Language Models with IdentityChain