phvv-me/frame-representation-hypothesis

Official repository for the Frame Representation Hypothesis paper.

/ 100

Emerging

This framework helps AI researchers and developers understand and control Large Language Models (LLMs). It uses WordNet data to generate concepts, which can then guide an LLM's text generation or expose potential biases and vulnerabilities in the model. The results are meant to make LLM output more predictable and reliable, which is useful for those who fine-tune, evaluate, or deploy LLMs.
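As a rough illustration of the idea of steering generation toward a concept, here is a minimal sketch. It is not the repository's code, and the actual method may differ. It assumes a concept can be approximated by averaging the vectors of a few related words (standing in for words grouped under one WordNet synset), and that steering means choosing, among the model's top candidates, the one closest to that concept. The embeddings, the candidate log-probabilities, and the names `TOY_EMBEDDINGS`, `mean_vector`, and `steer` are all hypothetical.

```python
import math

# Hypothetical 3-d embeddings; a real setup would read vectors from the model.
TOY_EMBEDDINGS = {
    "dog": [0.9, 0.1, 0.0],
    "puppy": [0.8, 0.2, 0.0],
    "hound": [0.85, 0.15, 0.05],
    "car": [0.0, 0.1, 0.9],
    "truck": [0.1, 0.0, 0.8],
}


def mean_vector(words):
    """Average the vectors of several words to get one concept vector."""
    vectors = [TOY_EMBEDDINGS[w] for w in words]
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def steer(candidates, concept, top_k=3):
    """Keep the top_k candidates by log-probability, then pick the one
    whose vector is most similar to the concept vector."""
    shortlist = sorted(candidates, key=lambda c: c[1], reverse=True)[:top_k]
    best = max(shortlist, key=lambda c: cosine(TOY_EMBEDDINGS[c[0]], concept))
    return best[0]


# Assumed concept: words that would share a WordNet synset or hypernym.
animal = mean_vector(["dog", "puppy"])

# Assumed next-word candidates as (word, log-probability) pairs.
candidates = [("car", -0.5), ("truck", -0.9), ("hound", -1.2), ("dog", -3.0)]

choice = steer(candidates, animal)
print(choice)
```

With `top_k=3` the shortlist is car, truck, and hound, and the sketch selects the candidate nearest the animal concept even though it is not the most probable one. Restricting the choice to a shortlist is a design choice that keeps the steered output among words the model already considers likely.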

No commits in the last 6 months.

Use this if you want deeper insight into how LLMs form their responses and a method to steer their output toward specific concepts or to identify undesirable behaviors.

Not ideal if you are an end user who simply wants to interact with a pre-trained LLM for general tasks. This is a tool for LLM analysis and control, not for direct user interaction.

LLM interpretability, AI safety, natural language generation, model evaluation, computational linguistics

Stale (6 months), no package, no dependents

Maintenance 2 / 25

Adoption 4 / 25

Maturity 16 / 25

Community 14 / 25

Language: Jupyter Notebook

License: MIT

Higher-rated alternatives

MadryLab/context-cite

Attribute (or cite) statements generated by LLMs back to in-context information.

microsoft/augmented-interpretable-models

Interpretable and efficient predictors using pre-trained language models. Scikit-learn compatible.

Trustworthy-ML-Lab/CB-LLMs

[ICLR 25] A novel framework for building intrinsically interpretable LLMs with...

poloclub/LLM-Attributor

LLM Attributor: Attribute LLM's Generated Text to Training Data

THUDM/LongCite

LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA
