xirui-li/DrAttack

Official implementation of paper: DrAttack: Prompt Decomposition and Reconstruction Makes Powerful LLM Jailbreakers

Score: 42 / 100 (Emerging)

This project helps security researchers and red teamers evaluate the robustness of large language models (LLMs) against adversarial prompts. It decomposes a potentially harmful prompt into sub-prompts, reconstructs them with subtle changes, and searches for synonyms to produce 'jailbreak' prompts. The output is an adversarial prompt designed to bypass LLM safety mechanisms.

No commits in the last 6 months.

Use this if you are a security researcher or red teamer needing to rigorously test the safety alignment of LLMs like GPT-4, Gemini, or Llama2.

Not ideal if you are looking for a general-purpose LLM prompt engineering tool or a way to bypass safety guidelines for malicious purposes.

Tags: LLM security, red teaming, prompt vulnerability, AI safety evaluation, adversarial testing
Badges: Stale (6m), No Package, No Dependents
Maintenance 0 / 25
Adoption 8 / 25
Maturity 16 / 25
Community 18 / 25

Stars: 66
Forks: 13
Language: JavaScript
License: MIT
Last pushed: Aug 25, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/xirui-li/DrAttack"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
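
A minimal sketch of consuming the endpoint from Python, assuming the API returns JSON; the field name used below ("score") is an assumption, not a documented response field:

import json
import urllib.request

# Hypothetical consumer of the quality endpoint shown above.
# Field names such as "score" are assumptions; inspect the actual
# JSON payload before relying on them.
URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools/xirui-li/DrAttack"

with urllib.request.urlopen(URL, timeout=10) as resp:
    data = json.loads(resp.read().decode("utf-8"))

print(json.dumps(data, indent=2))    # dump the full payload
print("score:", data.get("score"))   # e.g. 42 (assumed field name)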