yueliu1999/Awesome-Jailbreak-on-LLMs
Awesome-Jailbreak-on-LLMs is a collection of state-of-the-art, novel, and exciting jailbreak methods for LLMs. It contains papers, code, datasets, evaluations, and analyses.
This resource curates techniques for evaluating and improving the safety of large language models (LLMs). It includes research papers, code, and datasets covering both jailbreak attacks (attempts to bypass safety mechanisms) and defenses against them. AI safety researchers and practitioners who build or deploy LLMs can use it to understand vulnerabilities and develop more robust, responsible systems.
1,245 stars. Actively maintained with 14 commits in the last 30 days.
Use this if you are a researcher or engineer focused on understanding, testing, and hardening the safety mechanisms of Large Language Models against adversarial exploits.
Not ideal if you are looking for a plug-and-play tool for general LLM development or for simple prompt engineering.
Stars: 1,245
Forks: 101
Language: —
License: MIT
Category: —
Last pushed: Mar 07, 2026
Commits (30d): 14
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/yueliu1999/Awesome-Jailbreak-on-LLMs"
Open to everyone: 100 requests/day with no key needed. Get a free key to raise the limit to 1,000 requests/day.
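A minimal sketch of both access modes is below. The Authorization header and the PT_EDGE_API_KEY variable are assumptions for illustration; the page does not document how the key is passed, so check the API docs for the actual scheme.

# Keyless request (100 requests/day), pretty-printed with jq:
curl -s "https://pt-edge.onrender.com/api/v1/quality/llm-tools/yueliu1999/Awesome-Jailbreak-on-LLMs" | jq .

# Keyed request (1,000 requests/day). The Bearer-token header below is an
# assumption, not a documented parameter; substitute the scheme the API specifies.
curl -s -H "Authorization: Bearer $PT_EDGE_API_KEY" \
  "https://pt-edge.onrender.com/api/v1/quality/llm-tools/yueliu1999/Awesome-Jailbreak-on-LLMs"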
Related tools
wuyoscar/ISC-Bench
Internal Safety Collapse: Turning LLMs into a "Jailbroken State" Without "a Jailbreak Attack".
yiksiu-chan/SpeakEasy
[ICML 2025] Speak Easy: Eliciting Harmful Jailbreaks from LLMs with Simple Interactions
xirui-li/DrAttack
Official implementation of paper: DrAttack: Prompt Decomposition and Reconstruction Makes...
tmlr-group/DeepInception
[arXiv:2311.03191] "DeepInception: Hypnotize Large Language Model to Be Jailbreaker"
Techiral/awesome-llm-jailbreaks
Latest AI Jailbreak Payloads & Exploit Techniques for GPT, QWEN, and all LLM Models