rmovva/HypotheSAEs

HypotheSAEs: hypothesizing interpretable relationships in text datasets using sparse autoencoders. https://arxiv.org/abs/2502.04382


Established

HypotheSAEs helps researchers and analysts uncover meaningful patterns in large text datasets, such as why certain news headlines get more clicks or which political party a speech comes from. You provide a collection of texts alongside a target outcome (e.g., engagement metrics, political affiliation), and it outputs clear, human-readable descriptions of concepts in the text that predict that outcome. The tool is aimed at anyone working with textual data who needs to understand the drivers behind observed trends or classifications.

Available on PyPI.

Use this if you have a dataset of texts and a related variable, and you want to discover specific concepts or themes within those texts that explain why that variable changes or behaves the way it does.

Not ideal if your text data offers no predictive signal for your target variable, or if you need interpretations for long documents (more than about 500 words) that have not first been chunked or summarized.
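
For documents over the 500-word mark, one way to do the chunking step is sketched below. This is a minimal illustration, not part of HypotheSAEs: the function names `chunk_by_words` and `chunk_dataset` are hypothetical, words are split on whitespace only, and each chunk is assumed to inherit the target value of the document it came from, which may not suit every dataset.

```python
from typing import List, Sequence, Tuple


def chunk_by_words(text: str, max_words: int = 500) -> List[str]:
    """Split text into consecutive chunks of at most max_words whitespace-separated words."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    # Step through the word list in strides of max_words; empty text yields no chunks.
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def chunk_dataset(
    texts: Sequence[str], targets: Sequence[float], max_words: int = 500
) -> Tuple[List[str], List[float]]:
    """Chunk every document and repeat its target value for each chunk (an assumption)."""
    if len(texts) != len(targets):
        raise ValueError("texts and targets must have the same length")
    chunked_texts: List[str] = []
    chunked_targets: List[float] = []
    for text, target in zip(texts, targets):
        for chunk in chunk_by_words(text, max_words):
            chunked_texts.append(chunk)
            chunked_targets.append(target)
    return chunked_texts, chunked_targets


# Example: one long document and one short one, each with an engagement score.
docs = ["word " * 1200, "a short headline"]
scores = [0.8, 0.1]
texts, targets = chunk_dataset(docs, scores)
print(len(texts), targets)
```

The chunked texts and repeated targets can then be passed to the analysis in place of the original documents. Summarization is the alternative when a single label per whole document matters more than coverage of every passage.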

text-analytics market-research social-science content-strategy political-science

Maintenance 10 / 25

Adoption 9 / 25

Maturity 25 / 25

Community 20 / 25


Language

Jupyter Notebook

License

Apache-2.0

Related tools

interpretml/interpret-text

A library that incorporates state-of-the-art explainers for text-based machine learning models...

fdalvi/NeuroX

A Python library that encapsulates various methods for neuron interpretation and analysis in...

jalammar/ecco

Explain, analyze, and visualize NLP language models. Ecco creates interactive visualizations...

alexdyysp/ESIM-pytorch

China Collegiate Computing Contest: Big Data Challenge

MultiplEYE-COST/wg1-experiment-implementation

In this repository we keep the code for the implementation of the eye-tracking experiment for...
