jinzhuoran/RWKU

RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models. NeurIPS 2024

/ 100

Experimental

This project helps AI researchers and machine learning engineers evaluate how effectively large language models (LLMs) can forget specific real-world information, such as facts about famous people. You provide an LLM and the knowledge you want it to forget (e.g., "Stephen King"), and the project measures whether the model has unlearned that information without degrading its other abilities. It is aimed at AI researchers, LLM developers, and machine learning engineers who work on model safety and ethical AI.
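The idea of checking forgetting and retained ability side by side can be sketched in a few lines of Python. This is a minimal illustration only, not RWKU's actual API or metrics: the names `accuracy`, `unlearning_report`, and `toy_model`, the substring-match scoring, and the two probe sets are all assumptions made for this example.

```python
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical probe format: (question, substring expected in a correct answer).
QA = Tuple[str, str]


def accuracy(model: Callable[[str], str], probes: Iterable[QA]) -> float:
    """Fraction of probes whose expected answer appears in the model output."""
    probes = list(probes)
    if not probes:
        return 0.0
    hits = sum(1 for question, answer in probes
               if answer.lower() in model(question).lower())
    return hits / len(probes)


def unlearning_report(model: Callable[[str], str],
                      forget_probes: Iterable[QA],
                      retain_probes: Iterable[QA]) -> Dict[str, float]:
    """Score the forget target and unrelated knowledge separately."""
    return {
        "forget_accuracy": accuracy(model, forget_probes),  # lower is better
        "retain_accuracy": accuracy(model, retain_probes),  # higher is better
    }


def toy_model(prompt: str) -> str:
    """Stand-in for an unlearned LLM (assumption: a real model goes here)."""
    answers = {"What is the capital of France?": "Paris"}
    return answers.get(prompt, "I don't know.")


forget_probes = [("Who wrote the novel 'It'?", "Stephen King")]
retain_probes = [("What is the capital of France?", "Paris")]

report = unlearning_report(toy_model, forget_probes, retain_probes)
print(report)  # {'forget_accuracy': 0.0, 'retain_accuracy': 1.0}
```

Reporting the two numbers separately matters because a model that refuses every question would also score zero on the forget set; only the retain score reveals that its other abilities were damaged.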

No commits in the last 6 months.

Use this if you need a standardized benchmark to rigorously test and compare different knowledge unlearning methods for large language models.

Not ideal if you are looking for a tool to implement a knowledge unlearning technique rather than evaluate one.

LLM evaluation, AI safety, machine unlearning, model privacy, natural language processing

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 9 / 25

Maturity 8 / 25

Community 12 / 25


Stars

Forks

Language: Python

License: none listed

Higher-rated alternatives

steering-vectors/steering-vectors

Steering vectors for transformer language models in Pytorch / Huggingface

jianghoucheng/AlphaEdit

AlphaEdit: Null-Space Constrained Knowledge Editing for Language Models, ICLR 2025 (Outstanding Paper)

kmeng01/memit

Mass-editing thousands of facts into a transformer memory (ICLR 2023)

boyiwei/alignment-attribution-code

[ICML 2024] Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications

jianghoucheng/AnyEdit

AnyEdit: Edit Any Knowledge Encoded in Language Models, ICML 2025
