thunlp/OpenBackdoor
An open-source toolkit for textual backdoor attack and defense (NeurIPS 2022 D&B, Spotlight)
This toolkit helps machine learning engineers and researchers assess the security and robustness of natural language processing (NLP) models. It allows you to simulate 'backdoor attacks' where malicious hidden triggers can manipulate model behavior, and then test various defense strategies. You input an NLP model and a dataset, and the toolkit helps you create poisoned data, launch attacks, and evaluate how well the model resists or how effective a defense is.
200 stars. No commits in the last 6 months.
Use this if you are developing or deploying NLP models and need to rigorously test their vulnerability to textual backdoor attacks and benchmark defense mechanisms.
Not ideal if you are looking for a general-purpose NLP development library or a tool for general data poisoning without a focus on backdoor attacks in text.
Stars
200
Forks
27
Language
Python
License
Apache-2.0
Category
Last pushed
Apr 10, 2023
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/thunlp/OpenBackdoor"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
thunlp/OpenAttack
An Open-Source Package for Textual Adversarial Attack.
thunlp/TAADpapers
Must-read Papers on Textual Adversarial Attack and Defense
jind11/TextFooler
A Model for Natural Language Attack on Text Classification and Inference
thunlp/SememePSO-Attack
Code and data of the ACL 2020 paper "Word-level Textual Adversarial Attacking as Combinatorial...
osoleve/glitchlings
Enemies for your LLM