tsinghua-fib-lab/ANeurIPS2024_SPV-MIA
[NeurIPS'24] "Membership Inference Attacks against Fine-tuned Large Language Models via Self-prompt Calibration"
This project evaluates the privacy risks of fine-tuned Large Language Models (LLMs). Given a fine-tuned LLM and a dataset, it determines whether specific data points from that dataset were used during the model's training. It is aimed at AI security researchers, privacy experts, and developers concerned about their models' data exposure.
200 stars. No commits in the last 6 months.
Use this if you need to assess the privacy vulnerabilities of a fine-tuned LLM, in particular whether individual training examples can be inferred from the model.
Not ideal if you are looking for general LLM development tools or for ways to improve model performance rather than to audit privacy.
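For background on what such an evaluation looks like in practice: calibrated membership-inference attacks score each candidate sample by comparing the fine-tuned target model's loss against a reference model's loss (SPV-MIA's contribution is constructing that reference by self-prompting the target). The sketch below illustrates only the generic calibrated-score idea, not this repository's code; the checkpoints, threshold, and helper function are illustrative assumptions.

# Sketch of a calibrated membership-inference score (generic idea only,
# not the SPV-MIA implementation; model names and threshold are assumptions).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def sample_loss(model, tokenizer, text):
    """Per-sample cross-entropy loss of `text` under `model`."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])
    return out.loss.item()

# Stand-in checkpoints: the fine-tuned target and a calibration reference.
# (SPV-MIA builds its reference by self-prompting the target model;
# here a generic base model serves as a placeholder.)
target = AutoModelForCausalLM.from_pretrained("gpt2")
reference = AutoModelForCausalLM.from_pretrained("distilgpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")  # shared GPT-2 vocabulary

candidate = "Example sentence whose membership we want to test."

# Calibrated score: how much lower the target's loss is than the reference's.
# A large positive score suggests the sample was seen during fine-tuning.
score = sample_loss(reference, tokenizer, candidate) - sample_loss(target, tokenizer, candidate)
THRESHOLD = 0.0  # illustrative; tuned on held-out data in practice
print("member" if score > THRESHOLD else "non-member", f"(score={score:.3f})")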
Stars: 200
Forks: 25
Language: Python
License: —
Category: —
Last pushed: Mar 13, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/tsinghua-fib-lab/ANeurIPS2024_SPV-MIA"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
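The same endpoint can be queried from Python; here is a minimal sketch using the requests library on the keyless tier. The response schema is not documented on this page, so the example simply pretty-prints whatever JSON comes back.

# Minimal sketch: fetch the quality data from Python instead of curl.
import json
import requests

URL = ("https://pt-edge.onrender.com/api/v1/quality/transformers/"
       "tsinghua-fib-lab/ANeurIPS2024_SPV-MIA")

resp = requests.get(URL, timeout=30)
resp.raise_for_status()  # fail loudly on rate limits or server errors
print(json.dumps(resp.json(), indent=2))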
Higher-rated alternatives
HowieHwong/TrustLLM
[ICML 2024] TrustLLM: Trustworthiness in Large Language Models
Intelligent-CAT-Lab/PLTranslationEmpirical
Artifact repository for the paper "Lost in Translation: A Study of Bugs Introduced by Large...
rishub-tamirisa/tamper-resistance
[ICLR 2025] Official Repository for "Tamper-Resistant Safeguards for Open-Weight LLMs"
FudanDISC/ReForm-Eval
A benchmark for evaluating the capabilities of large vision-language models (LVLMs)
codessian/epistemic-confidence-layer
Model-agnostic trust protocol for calibrated, auditable AI