GitsSaikat/Guardian-Agent

Improving AI Systems with Self-Defense Mechanisms

/ 100

Emerging

This project helps AI system developers protect their AI models from malicious prompts that try to hijack or manipulate their behavior. It takes in user prompts and ensures the AI model only acts on legitimate, safe instructions, preventing it from being 'jailbroken' or manipulated. Developers building AI agents who want to maintain strict control over their AI's responses and actions would use this.

No commits in the last 6 months.

Use this if you are developing an AI agent and need to build in robust self-defense mechanisms against adversarial prompt attacks and jailbreaking attempts.

Not ideal if you are an end-user of an existing AI system, as this tool is for developers to integrate into their AI agent's architecture.

AI safety AI security prompt engineering AI agent development malicious prompt prevention

Stale 6m No Package No Dependents

Maintenance 0 / 25

Adoption 6 / 25

Maturity 16 / 25

Community 17 / 25

How are scores calculated?

Stars

Forks

Language

Python

License

MIT

Featured in

Agent Governance in 2026: Who's Building the Guardrails? Agent Platforms Are Four Problems, Not One Your Agent Doesn't Have an Email Address (Yet) Your Agent is Hitting its Ceiling — Who's Actually Fixing It

Higher-rated alternatives

ucsandman/DashClaw

🛡️Decision infrastructure for AI agents. Intercept actions, enforce guard policies, require...

Dicklesworthstone/destructive_command_guard

The Destructive Command Guard (dcg) is for blocking dangerous git and shell commands from being...

microsoft/agent-governance-toolkit

AI Agent Governance Toolkit — Policy enforcement, zero-trust identity, execution sandboxing, and...

vstorm-co/pydantic-ai-shields

Guardrail capabilities for Pydantic AI — cost tracking, prompt injection detection, PII...

Pro-GenAI/Agent-Action-Guard

🛡️ Safe AI Agents through Action Classifier

Explore AI Agents

All categories Trending AI Agent directory Insights