csm9493/efficient-llm-unlearning
Towards Robust and Parameter-Efficient Knowledge Unlearning for LLMs (ICLR 2025)
This project helps machine learning engineers and researchers remove specific, undesirable information from large language models (LLMs) without retraining them from scratch. You provide a pre-trained LLM and the data it should "forget," and it outputs a modified LLM that no longer retains that knowledge, along with metrics for evaluating the unlearning process. This is useful for teams responsible for model compliance or for updating knowledge in deployed LLMs.
No commits in the last 6 months.
Use this if you need to efficiently remove specific training data or facts from a large language model to address privacy concerns, correct misinformation, or update outdated information.
Not ideal if you aren't working directly with training or fine-tuning large language models, or if you need to erase an entire broad category of information, which would require extensive retraining.
Stars: 13
Forks: 1
Language: Python
License: MIT
Category:
Last pushed: Apr 04, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/csm9493/efficient-llm-unlearning"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
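The same endpoint can also be called programmatically. A minimal Python sketch using only the standard library (the response schema is not documented here, so the code simply decodes whatever JSON the endpoint returns):

```python
# Minimal sketch of calling the quality API from Python.
# The endpoint URL comes from the curl example above; the shape of the
# returned JSON is an assumption and should be inspected before relying on it.
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"

def quality_url(repo: str) -> str:
    """Build the endpoint URL for an 'owner/name' repository slug."""
    return f"{API_BASE}/{repo}"

def fetch_quality(repo: str, timeout: float = 10.0) -> dict:
    """Fetch and decode the JSON payload for a repo (requires network access)."""
    with urllib.request.urlopen(quality_url(repo), timeout=timeout) as resp:
        return json.load(resp)

if __name__ == "__main__":
    print(quality_url("csm9493/efficient-llm-unlearning"))
```

With no API key this falls under the 100 requests/day anonymous limit, so cache responses rather than polling.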
Higher-rated alternatives
stair-lab/mlhp — Machine Learning from Human Preferences
princeton-nlp/SimPO — [NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward
uclaml/SPPO — The official implementation of Self-Play Preference Optimization (SPPO)
general-preference/general-preference-model — [ICML 2025] Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment...
sail-sg/dice — Official implementation of Bootstrapping Language Models via DPO Implicit Rewards