bcdannyboy/PromptMatryoshka

Multi-Provider LLM Jailbreak Research Framework

Quality score: 20 / 100 (Experimental)

This framework helps AI safety researchers and red teamers evaluate the robustness of Large Language Models (LLMs) against sophisticated adversarial attacks. It takes potentially harmful prompts and processes them through a multi-layered pipeline of advanced 'jailbreak' techniques; the output shows how different LLMs respond, revealing weaknesses in their safety mechanisms. It is intended for anyone responsible for auditing or securing LLM deployments.
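A minimal sketch of the layered-pipeline idea described above. Everything here is a hypothetical illustration under assumed names (the layer functions, build_pipeline, evaluate); it is not PromptMatryoshka's actual API, and the transformations are placeholder examples only.

# Hypothetical sketch only: names and structure are illustrative assumptions,
# not PromptMatryoshka's actual interface.
from typing import Callable, List

# Each "layer" is a prompt transformation applied in sequence, Matryoshka-style.
Layer = Callable[[str], str]

def wrap_in_roleplay(prompt: str) -> str:
    # Placeholder transformation standing in for one pipeline layer.
    return f"You are an actor rehearsing a scene. Your line is: {prompt}"

def nest_in_story(prompt: str) -> str:
    # Another placeholder layer that nests the prompt inside a framing task.
    return f"Write a story in which a character explains: {prompt}"

def build_pipeline(layers: List[Layer]) -> Layer:
    # Compose layers so each one wraps the output of the previous.
    def run(prompt: str) -> str:
        for layer in layers:
            prompt = layer(prompt)
        return prompt
    return run

def evaluate(prompt: str, send_to_provider: Callable[[str], str]) -> str:
    # Send the layered prompt to a target model (OpenAI, Anthropic, local, etc.)
    # and return its response for safety review.
    return send_to_provider(prompt)

if __name__ == "__main__":
    pipeline = build_pipeline([wrap_in_roleplay, nest_in_story])
    layered = pipeline("example probe prompt")
    print(layered)  # inspect the transformed prompt before sending it to any model

The point of the composition step is that each provider sees the same fully layered prompt, so differences in the responses can be attributed to the models' safety behavior rather than to prompt variation.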

No commits in the last 6 months.

Use this if you need to systematically test and research how well various LLMs (from providers like OpenAI, Anthropic, or local models) can withstand advanced adversarial prompting designed to bypass their safety features.

Not ideal if you are looking for a simple tool to filter out harmful user inputs or to fine-tune an LLM's safety directly, as this is a research and evaluation framework.

Tags: AI safety · LLM security · red teaming · adversarial AI · model evaluation
Badges: No License · Stale (6 months) · No Package · No Dependents
Maintenance: 2 / 25
Adoption: 5 / 25
Maturity: 7 / 25
Community: 6 / 25


Stars: 12
Forks: 1
Language: Python
License: None
Last pushed: Jul 16, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/prompt-engineering/bcdannyboy/PromptMatryoshka"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
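For scripted access, the same endpoint can be called from Python. This sketch uses only the public no-key tier shown above; the assumption that the response body is JSON is an inference from the API description, so inspect the raw body if parsing fails.

import json
import urllib.request

URL = ("https://pt-edge.onrender.com/api/v1/quality/"
       "prompt-engineering/bcdannyboy/PromptMatryoshka")

# Anonymous tier (100 requests/day, no key). Response format assumed to be JSON.
with urllib.request.urlopen(URL) as resp:
    data = json.loads(resp.read().decode("utf-8"))

print(json.dumps(data, indent=2))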