verazuo/jailbreak_llms

[CCS'24] A dataset of 15,140 ChatGPT prompts collected from Reddit, Discord, websites, and open-source datasets, including 1,405 jailbreak prompts.

Score: 45 / 100 (Emerging)

This project provides a collection of over 15,000 real-world prompts, including 1,400+ 'jailbreak' prompts designed to bypass AI safety filters. It helps AI safety researchers and developers understand how users attempt to elicit harmful content from large language models. The dataset of prompts serves as the input; the outputs are insights into common jailbreaking techniques and a structured question set for evaluation.

3,596 stars. No commits in the last 6 months.

Use this if you are an AI safety researcher, LLM developer, or policy maker seeking to analyze, understand, and defend against methods used to bypass large language model safeguards.

Not ideal if you are looking for a tool to generate harmful content or for general prompt engineering resources that do not focus on security vulnerabilities.

AI safety · LLM security · prompt engineering · research · content moderation · harmful content detection
Stale (6m) · No Package · No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 19 / 25


Stars: 3,596
Forks: 320
Language: Jupyter Notebook
License: MIT
Last pushed: Dec 24, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/verazuo/jailbreak_llms"

Open to everyone: 100 requests/day with no key needed; a free key raises the limit to 1,000 requests/day.