ByungKwanLee/Adversarial-Information-Bottleneck

[NeurIPS 2021] Official PyTorch Implementation for "Distilling Robust and Non-Robust Features in Adversarial Examples by Information Bottleneck"

Overall score: 31 / 100 (Emerging)

This project helps machine learning researchers and security analysts understand how image classification models perceive visual information, especially when dealing with 'adversarial examples.' It takes images and trained classification models as input, and outputs visualizations of what parts of an image the model considers 'robust' (essential for correct classification) versus 'non-robust' (easily manipulated by adversaries). This allows users to dissect model behavior and improve its resistance to malicious inputs.
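
Conceptually, the decomposition rests on an information-bottleneck objective: keep only the feature information needed to predict the label, and compress everything else away. Below is a minimal, generic variational information bottleneck sketch in PyTorch for intuition only; it is not the repository's actual code, and every name (FeatureBottleneck, vib_loss, beta) is an illustrative assumption.

# Illustrative variational information bottleneck (VIB) sketch; NOT this repo's API.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureBottleneck(nn.Module):
    # Compresses backbone features into a stochastic code z ~ N(mu, sigma^2)
    # and classifies from z alone.
    def __init__(self, feat_dim, code_dim, num_classes):
        super().__init__()
        self.mu = nn.Linear(feat_dim, code_dim)
        self.log_var = nn.Linear(feat_dim, code_dim)
        self.classifier = nn.Linear(code_dim, num_classes)

    def forward(self, features):
        mu, log_var = self.mu(features), self.log_var(features)
        z = mu + torch.randn_like(mu) * (0.5 * log_var).exp()  # reparameterization trick
        return self.classifier(z), mu, log_var

def vib_loss(logits, labels, mu, log_var, beta=1e-3):
    # Cross-entropy preserves label-relevant information; the KL term to
    # N(0, I) penalizes everything the code carries beyond that.
    ce = F.cross_entropy(logits, labels)
    kl = -0.5 * (1.0 + log_var - mu.pow(2) - log_var.exp()).sum(dim=1).mean()
    return ce + beta * kl

# Toy usage with random stand-ins for a backbone's features:
bottleneck = FeatureBottleneck(feat_dim=512, code_dim=64, num_classes=10)
logits, mu, log_var = bottleneck(torch.randn(8, 512))
loss = vib_loss(logits, torch.randint(0, 10, (8,)), mu, log_var)

In this framing, the information that survives the bottleneck under the KL penalty plays the role of the robust signal, while what gets compressed away corresponds to the easily manipulated remainder.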

No commits in the last 6 months.

Use this if you are a machine learning researcher or security specialist interested in the internal workings of image classification models, especially concerning their vulnerability to adversarial attacks and the interpretability of features.

Not ideal if you are looking for a simple, out-of-the-box solution to directly train a robust model without diving into the underlying feature decomposition and visualization.

Tags: Adversarial Robustness, Model Interpretability, Computer Vision, Security, Deep Learning, Research, Image Classification
Status: Stale (6 months), no published package, no dependents.
Maintenance: 0 / 25
Adoption: 8 / 25
Maturity: 16 / 25
Community: 7 / 25

The four category scores sum to the overall score: 0 + 8 + 16 + 7 = 31.


Stars: 49
Forks: 3
Language: Python
License: MIT
Last pushed: Mar 13, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/ByungKwanLee/Adversarial-Information-Bottleneck"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
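
For programmatic use, the same endpoint can be queried from Python with only the standard library. This is a hedged sketch: the shape of the returned JSON is an assumption, so inspect the payload rather than relying on specific field names.

# Sketch: fetch the quality data in Python instead of curl (stdlib only).
# The JSON schema is not documented here, so we just pretty-print it.
import json
import urllib.request

URL = ("https://pt-edge.onrender.com/api/v1/quality/diffusion/"
       "ByungKwanLee/Adversarial-Information-Bottleneck")

with urllib.request.urlopen(URL, timeout=10) as resp:
    data = json.load(resp)

print(json.dumps(data, indent=2))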