bcdannyboy/PromptMatryoshka
Multi-Provider LLM Jailbreak Research Framework
This framework helps AI safety researchers and red teamers evaluate the robustness of Large Language Models (LLMs) against sophisticated adversarial attacks. It accepts potentially harmful prompts and processes them through a multi-layered pipeline of advanced jailbreak techniques, then reports how different LLMs respond, exposing weaknesses in their safety mechanisms. It is aimed at anyone responsible for auditing or securing LLM deployments.
No commits in the last 6 months.
Use this if you need to systematically test and research how well various LLMs (from providers such as OpenAI or Anthropic, or local models) withstand advanced adversarial prompting designed to bypass their safety features.
Not ideal if you are looking for a simple tool to filter harmful user inputs or to fine-tune an LLM's safety directly; this is a research and evaluation framework.
Stars: 12
Forks: 1
Language: Python
License: —
Category:
Last pushed: Jul 16, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/prompt-engineering/bcdannyboy/PromptMatryoshka"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
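The same endpoint can be queried programmatically; here is a minimal Python sketch using only the standard library. The URL pattern is taken from the curl example above, but the JSON response schema is not documented on this page, so the fetch just pretty-prints whatever payload comes back.

```python
import json
import urllib.request

# Base URL taken from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality/prompt-engineering"


def quality_url(owner: str, repo: str) -> str:
    """Build the quality-API URL for a given GitHub owner/repo pair."""
    return f"{API_BASE}/{owner}/{repo}"


if __name__ == "__main__":
    # Unauthenticated access is limited to 100 requests/day.
    url = quality_url("bcdannyboy", "PromptMatryoshka")
    with urllib.request.urlopen(url, timeout=10) as resp:
        data = json.load(resp)  # schema undocumented here; inspect interactively
    print(json.dumps(data, indent=2))
```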
Higher-rated alternatives
protectai/llm-guard
The Security Toolkit for LLM Interactions
MaxMLang/pytector
Easy to use LLM Prompt Injection Detection / Detector Python Package with support for local...
utkusen/promptmap
a security scanner for custom LLM applications
agencyenterprise/PromptInject
PromptInject is a framework that assembles prompts in a modular fashion to provide a...
Resk-Security/Resk-LLM
Resk is a robust Python library designed to enhance security and manage context when...