lkopf/prism

[NeurIPS 2025] PRISM is a multi-concept feature description framework which can identify and score polysemantic features.

/ 100

Experimental

When analyzing how AI models make decisions, it is often unclear why a given internal feature activates. PRISM describes each feature with several human-readable descriptions rather than a single one, so that polysemantic features (features that respond to more than one concept) are represented faithfully. It takes model activations as input and outputs clustered descriptions of the concepts the feature detects. It is aimed at researchers and practitioners working on model interpretability or explainable AI.
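The general idea of going from activations to clustered concepts can be illustrated with a minimal sketch. This is not PRISM's actual code or API: the function names (`top_activating`, `kmeans`, `cluster_top_samples`), the use of precomputed text embeddings, and the choice of k-means with a deterministic farthest-point initialization are all assumptions made for illustration. The step that turns each cluster into a natural-language description is left out.

```python
import numpy as np

def top_activating(activations, k):
    """Indices of the k samples with the highest activation, highest first."""
    order = np.argsort(activations)[::-1]
    return order[:k]

def kmeans(points, n_clusters, n_iter=20):
    """Lloyd's k-means with farthest-point initialization (deterministic)."""
    # Start from the first point, then repeatedly add the point that is
    # farthest from all centers chosen so far.
    centers = [points[0]]
    for _ in range(1, n_clusters):
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its points (unchanged if empty).
        for c in range(n_clusters):
            if np.any(labels == c):
                centers[c] = points[labels == c].mean(axis=0)
    return labels

def cluster_top_samples(texts, activations, embeddings, k=6, n_clusters=2):
    """Group the k most strongly activating texts by embedding similarity.

    Hypothetical helper: each returned cluster is a candidate "concept"
    that a separate step would still have to describe in words.
    """
    idx = top_activating(np.asarray(activations, dtype=float), k)
    labels = kmeans(np.asarray(embeddings, dtype=float)[idx], n_clusters)
    clusters = {}
    for i, lab in zip(idx, labels):
        clusters.setdefault(int(lab), []).append(texts[int(i)])
    return clusters
```

A feature whose top-activating samples fall into clearly separate clusters is a candidate polysemantic feature; a feature whose samples form one tight cluster is closer to monosemantic.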

No commits in the last 6 months.

Use this if you need to deeply understand the multiple concepts an AI model's internal features are responding to, moving beyond simplistic, single explanations.

Not ideal if you are looking for a simple, single-word explanation for every AI feature or if your primary goal is basic performance evaluation rather than interpretability.

AI-interpretability explainable-AI neural-network-analysis model-debugging concept-extraction

No License · Stale 6m · No Package · No Dependents

Maintenance 2 / 25

Adoption 4 / 25

Maturity 7 / 25

Community 8 / 25

Language: Jupyter Notebook

License: none

Higher-rated alternatives

opentensor/bittensor

Internet-scale Neural Networks

trailofbits/fickling

A Python pickling decompiler and static analyzer

benchopt/benchopt

A framework for reproducible, comparable benchmarks

BiomedSciAI/fuse-med-ml

A python framework accelerating ML based discovery in the medical field by encouraging code...

mosaicml/streaming

A Data Streaming Library for Efficient Neural Network Training
