HKU-TASR/Imperio

[IJCAI 2024] Imperio is an LLM-powered backdoor attack. It allows the adversary to issue language-guided instructions to control the victim model's prediction for arbitrary targets.

Quality score: 34 / 100 (Emerging)

This project helps security researchers and AI auditors understand a new type of vulnerability in machine learning models, specifically in image classification. It takes a clean image dataset and, using language-guided instructions, trains a 'backdoored' model. The output is a model that can be controlled to misclassify specific images based on text commands, while still performing accurately on normal inputs.
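At its core, a backdoor of this kind is planted by poisoning a fraction of the training data so that triggered inputs map to an attacker-chosen label while the rest of the dataset stays clean. The sketch below is a generic illustration of that poisoning step, not Imperio's actual code; the function name, the fixed poisoning rate, and the label-flipping policy are all assumptions for illustration (in Imperio the target label is derived from a language instruction and the poisoned images would also carry a trigger).

```python
import random

def poison_labels(labels, target_class, rate, seed=0):
    """Flip a fraction `rate` of training labels to `target_class`.

    Returns the poisoned label list plus the flipped indices; in a real
    attack, the images at those indices would also be stamped with the
    backdoor trigger before training.
    """
    rng = random.Random(seed)
    n_poison = int(len(labels) * rate)
    idx = rng.sample(range(len(labels)), n_poison)
    poisoned = list(labels)
    for i in idx:
        poisoned[i] = target_class
    return poisoned, idx

# Hypothetical 10-sample label set: poison 30% toward class 7.
clean = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
poisoned, flipped = poison_labels(clean, target_class=7, rate=0.3)
```

A model trained on such a mix behaves normally on clean inputs but obeys the attacker on triggered ones, which is the property the listing above describes.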

No commits in the last 6 months.

Use this if you are researching advanced backdoor attacks on image classification models and need a tool to create and evaluate language-guided backdoor vulnerabilities.

Not ideal if you are looking for a defensive tool to detect or mitigate existing backdoors, or if your focus is on NLP model vulnerabilities.

Tags: AI-security, ML-auditing, threat-modeling, image-classification, model-vulnerability
Badges: Stale (6m), No Package, No Dependents

Score breakdown:
Maintenance: 0 / 25
Adoption: 8 / 25
Maturity: 16 / 25
Community: 10 / 25


Stars: 44
Forks: 4
Language: Python
License: MIT
Last pushed: Feb 18, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/HKU-TASR/Imperio"

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000/day.
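The same endpoint can be called from Python with the standard library. This is a minimal sketch, assuming the URL path pattern above generalizes to other owner/repo pairs and that the endpoint returns a JSON body; the function names are mine, not part of any published client.

```python
import json
import urllib.request

BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"

def quality_url(owner: str, repo: str) -> str:
    """Build the quality-score endpoint URL for an owner/repo pair."""
    return f"{BASE}/{owner}/{repo}"

def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch and decode the quality report (assumes a JSON response)."""
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Mirrors the curl example above without issuing a request here.
url = quality_url("HKU-TASR", "Imperio")
```

`fetch_quality("HKU-TASR", "Imperio")` would then perform the same request as the curl command, subject to the daily rate limit.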