bcdannyboy/PromptMatryoshka

Multi-Provider LLM Jailbreak Research Framework

Quality score: 20 / 100 (Experimental)

This framework helps AI safety researchers and red teamers evaluate the robustness of Large Language Models (LLMs) against sophisticated adversarial attacks. It takes potentially harmful prompts and processes them through a multi-layered pipeline of advanced 'jailbreak' techniques; the output shows how different LLMs respond, revealing weaknesses in their safety mechanisms. It is intended for anyone responsible for auditing or securing LLM deployments.
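A minimal sketch of the layered-pipeline idea described above. Everything here is a hypothetical illustration under assumed names (the layer functions, build_pipeline, evaluate); it is not PromptMatryoshka's actual API, and the transformations are placeholder examples only.

# Hypothetical sketch only: names and structure are illustrative assumptions,
# not PromptMatryoshka's actual interface.
from typing import Callable, List

# Each "layer" is a prompt transformation applied in sequence, Matryoshka-style.
Layer = Callable[[str], str]

def wrap_in_roleplay(prompt: str) -> str:
    # Placeholder transformation standing in for one pipeline layer.
    return f"You are an actor rehearsing a scene. Your line is: {prompt}"

def nest_in_story(prompt: str) -> str:
    # Another placeholder layer that nests the prompt inside a framing task.
    return f"Write a story in which a character explains: {prompt}"

def build_pipeline(layers: List[Layer]) -> Layer:
    # Compose layers so each one wraps the output of the previous.
    def run(prompt: str) -> str:
        for layer in layers:
            prompt = layer(prompt)
        return prompt
    return run

def evaluate(prompt: str, send_to_provider: Callable[[str], str]) -> str:
    # Send the layered prompt to a target model (OpenAI, Anthropic, local, etc.)
    # and return its response for safety review.
    return send_to_provider(prompt)

if __name__ == "__main__":
    pipeline = build_pipeline([wrap_in_roleplay, nest_in_story])
    layered = pipeline("example probe prompt")
    print(layered)  # inspect the transformed prompt before sending it to any model

The point of the composition step is that each provider sees the same fully layered prompt, so differences in the responses can be attributed to the models' safety behavior rather than to prompt variation.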

No commits in the last 6 months.

Use this if you need to systematically test and research how well various LLMs (from providers like OpenAI, Anthropic, or local models) can withstand advanced adversarial prompting designed to bypass their safety features.

Not ideal if you are looking for a simple tool to filter out harmful user inputs or to fine-tune an LLM's safety directly, as this is a research and evaluation framework.

Tags: AI safety · LLM security · red teaming · adversarial AI · model evaluation
Badges: No License · Stale (6 months) · No Package · No Dependents
Maintenance: 2 / 25
Adoption: 5 / 25
Maturity: 7 / 25
Community: 6 / 25


Stars: 12
Forks: 1
Language: Python
License: None
Last pushed: Jul 16, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/prompt-engineering/bcdannyboy/PromptMatryoshka"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
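For scripted access, the same endpoint can be called from Python. This sketch uses only the public no-key tier shown above; the assumption that the response body is JSON is an inference from the API description, so inspect the raw body if parsing fails.

import json
import urllib.request

URL = ("https://pt-edge.onrender.com/api/v1/quality/"
       "prompt-engineering/bcdannyboy/PromptMatryoshka")

# Anonymous tier (100 requests/day, no key). Response format assumed to be JSON.
with urllib.request.urlopen(URL) as resp:
    data = json.loads(resp.read().decode("utf-8"))

print(json.dumps(data, indent=2))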