tsinghua-fib-lab/ANeurIPS2024_SPV-MIA
[NeurIPS'24] "Membership Inference Attacks against Fine-tuned Large Language Models via Self-prompt Calibration"
This project evaluates the privacy risks of fine-tuned Large Language Models (LLMs). Given a fine-tuned LLM and a dataset, it determines whether specific data points from that dataset were used during the model's training. It is aimed at AI security researchers, privacy experts, and developers concerned about their models' data exposure.
200 stars. No commits in the last 6 months.
Use this if you need to assess the privacy vulnerabilities of a fine-tuned LLM, in particular whether individual training examples can be inferred from the model.
Not ideal if you are looking for general LLM development tools or for ways to improve model performance rather than to audit privacy.
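For background on what such an evaluation looks like in practice: calibrated membership-inference attacks score each candidate sample by comparing the fine-tuned target model's loss against a reference model's loss (SPV-MIA's contribution is constructing that reference by self-prompting the target). The sketch below illustrates only the generic calibrated-score idea, not this repository's code; the checkpoints, threshold, and helper function are illustrative assumptions.

# Sketch of a calibrated membership-inference score (generic idea only,
# not the SPV-MIA implementation; model names and threshold are assumptions).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def sample_loss(model, tokenizer, text):
    """Per-sample cross-entropy loss of `text` under `model`."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])
    return out.loss.item()

# Stand-in checkpoints: the fine-tuned target and a calibration reference.
# (SPV-MIA builds its reference by self-prompting the target model;
# here a generic base model serves as a placeholder.)
target = AutoModelForCausalLM.from_pretrained("gpt2")
reference = AutoModelForCausalLM.from_pretrained("distilgpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")  # shared GPT-2 vocabulary

candidate = "Example sentence whose membership we want to test."

# Calibrated score: how much lower the target's loss is than the reference's.
# A large positive score suggests the sample was seen during fine-tuning.
score = sample_loss(reference, tokenizer, candidate) - sample_loss(target, tokenizer, candidate)
THRESHOLD = 0.0  # illustrative; tuned on held-out data in practice
print("member" if score > THRESHOLD else "non-member", f"(score={score:.3f})")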
Stars: 200
Forks: 25
Language: Python
License: —
Category: —
Last pushed: Mar 13, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/tsinghua-fib-lab/ANeurIPS2024_SPV-MIA"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
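The same endpoint can be queried from Python; here is a minimal sketch using the requests library on the keyless tier. The response schema is not documented on this page, so the example simply pretty-prints whatever JSON comes back.

# Minimal sketch: fetch the quality data from Python instead of curl.
import json
import requests

URL = ("https://pt-edge.onrender.com/api/v1/quality/transformers/"
       "tsinghua-fib-lab/ANeurIPS2024_SPV-MIA")

resp = requests.get(URL, timeout=30)
resp.raise_for_status()  # fail loudly on rate limits or server errors
print(json.dumps(resp.json(), indent=2))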
Higher-rated alternatives
HowieHwong/TrustLLM
[ICML 2024] TrustLLM: Trustworthiness in Large Language Models
Intelligent-CAT-Lab/PLTranslationEmpirical
Artifact repository for the paper "Lost in Translation: A Study of Bugs Introduced by Large...
rishub-tamirisa/tamper-resistance
[ICLR 2025] Official Repository for "Tamper-Resistant Safeguards for Open-Weight LLMs"
FudanDISC/ReForm-Eval
A benchmark for evaluating the capabilities of large vision-language models (LVLMs)
codessian/epistemic-confidence-layer
Model-agnostic trust protocol for calibrated, auditable AI