zhaochen0110/LMLM

Code and data for "Improving Temporal Generalization of Pre-trained Language Models with Lexical Semantic Change" (EMNLP 2022)

Experimental

This project helps researchers and data scientists whose language models suffer from temporal drift. It takes a pre-trained language model, unlabeled text from a specific time period, and labeled data for a downstream task, and produces an adapted language model. The adapted model copes better with language as it evolves over time, which improves performance on tasks where vocabulary or word meanings change.

No commits in the last 6 months.

Use this if your language model's performance degrades on data from different time periods because words have changed in meaning or usage.

Not ideal if you want to improve language model performance in general and temporal shifts in language are not a concern.

natural-language-processing temporal-modeling text-analysis data-science research

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 6 / 25

Maturity 8 / 25

Community 5 / 25

Language

Python

License

—

Higher-rated alternatives

MadryLab/context-cite

Attribute (or cite) statements generated by LLMs back to in-context information.

microsoft/augmented-interpretable-models

Interpretable and efficient predictors using pre-trained language models. Scikit-learn compatible.

Trustworthy-ML-Lab/CB-LLMs

[ICLR 25] A novel framework for building intrinsically interpretable LLMs with...

poloclub/LLM-Attributor

LLM Attributor: Attribute LLM's Generated Text to Training Data

THUDM/LongCite

LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA
