eliahuhorwitz/Spectral-DeTuning
Official PyTorch Implementation for the "Recovering the Pre-Fine-Tuning Weights of Generative Models" paper (ICML 2024).
This tool helps AI security researchers and red teamers identify vulnerabilities in fine-tuned generative AI models. It takes as input multiple LoRA (Low-Rank Adaptation) fine-tuned models that originate from the same base model, and outputs the recovered weights of the original, pre-fine-tuned source model. It works even without access to the low-rank decomposition of the fine-tuned models.
No commits in the last 6 months.
Use this if you need to demonstrate or investigate how an attacker could recover the original, potentially unsafe, weights of a generative AI model after it has been fine-tuned for safety or other purposes.
Not ideal if you are looking to fine-tune models, optimize model performance, or perform standard model evaluation rather than a security analysis.
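The recovery idea described above can be illustrated with a small sketch. This is not the repository's code: it uses NumPy rather than PyTorch, works on a single synthetic weight matrix, and omits any refinements the official implementation may use. The function names `truncate_rank` and `recover_base_weights`, and the sizes in the demo, are hypothetical choices made for illustration. The sketch assumes each fine-tuned matrix equals a shared base matrix plus a residual of known low rank, and alternates between estimating the residuals and re-estimating the base.

```python
import numpy as np


def truncate_rank(matrix, rank):
    """Return the best rank-`rank` approximation of `matrix` (truncated SVD)."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]


def recover_base_weights(finetuned, rank, steps=200):
    """Estimate the shared base matrix from several low-rank fine-tuned copies.

    Hypothetical helper: alternates between (1) fitting a rank-`rank` residual
    for each model given the current base estimate and (2) averaging what is
    left over to update the base estimate.
    """
    stacked = np.stack(finetuned)
    base = stacked.mean(axis=0)  # initial guess: plain average of the models
    for _ in range(steps):
        residuals = np.stack([truncate_rank(w - base, rank) for w in stacked])
        base = (stacked - residuals).mean(axis=0)
    return base


# Synthetic demo (all sizes are illustrative assumptions).
rng = np.random.default_rng(0)
dim, rank, n_models = 64, 2, 8
true_base = rng.standard_normal((dim, dim))
finetuned = [
    true_base
    + rng.standard_normal((dim, rank)) @ rng.standard_normal((rank, dim))
    for _ in range(n_models)
]

recovered = recover_base_weights(finetuned, rank)

# Compare against simply averaging the fine-tuned matrices.
naive_error = np.linalg.norm(np.mean(finetuned, axis=0) - true_base)
recovered_error = np.linalg.norm(recovered - true_base)
print(f"naive average error: {naive_error:.4f}")
print(f"alternating estimate error: {recovered_error:.4f}")
```

Note that the sketch only needs the merged fine-tuned matrices, not the separate LoRA factors, which matches the setting the description refers to.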
Stars: 85
Forks: 4
Language: Python
License: —
Category:
Last pushed: Apr 15, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/eliahuhorwitz/Spectral-DeTuning"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
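The same request can be made from Python with the standard library. The sketch below is a minimal example under stated assumptions: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the three path segments are taken directly from the curl URL above, and the response format is not documented here, so the body is returned as raw text. It does not show how to pass an API key, since that is not specified.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path copied from the curl example; everything after it is path segments.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(section, owner, repo):
    """Build the endpoint URL (hypothetical helper; segment names are assumed)."""
    parts = [quote(part, safe="") for part in (section, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(section, owner, repo, timeout=10.0):
    """Fetch the raw response body as text; the schema is not assumed."""
    with urlopen(build_quality_url(section, owner, repo), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


url = build_quality_url("transformers", "eliahuhorwitz", "Spectral-DeTuning")
print(url)
# To perform the request (counts against the daily request limit):
# print(fetch_quality("transformers", "eliahuhorwitz", "Spectral-DeTuning"))
```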
Higher-rated alternatives
OptimalScale/LMFlow
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
adithya-s-k/AI-Engineering.academy
Mastering Applied AI, One Concept at a Time
jax-ml/jax-llm-examples
Minimal yet performant LLM examples in pure JAX
young-geng/scalax
A simple library for scaling up JAX programs
riyanshibohra/TuneKit
Upload your data → Get a fine-tuned SLM. Free.