yihedeng9/DuoGuard
DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails
DuoGuard helps developers and engineers ensure the safety of responses generated by large language models (LLMs) across multiple languages: it takes an LLM's output and determines whether it contains unsafe or undesirable content. This is useful for AI developers, product managers, and content moderation teams building or deploying multilingual LLMs.
No commits in the last 6 months.
Use this if you need to automatically detect and flag unsafe content from your LLMs across multiple languages with high accuracy and efficiency.
Not ideal if you are looking for a tool to generate text or translate content, as its primary function is safety moderation.
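As a rough sketch of how a classifier like this is typically wired into a moderation pipeline: the snippet below loads a sequence-classification checkpoint with Hugging Face transformers and flags text when any safety category crosses a threshold. The model ID, multi-label head, and 0.5 threshold are assumptions for illustration, not taken from this repository; check its README for the actual checkpoint and usage.

```python
# Minimal sketch of screening LLM output with a DuoGuard-style classifier.
# The model ID, label set, and threshold are assumptions for illustration.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_ID = "DuoGuard/DuoGuard-0.5B"  # hypothetical Hugging Face model ID

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)

def is_unsafe(llm_output: str, threshold: float = 0.5) -> bool:
    """Return True if any safety category exceeds the threshold."""
    inputs = tokenizer(llm_output, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    # Multilingual guardrails often emit one probability per category,
    # so apply a sigmoid per label rather than a softmax over labels.
    probs = torch.sigmoid(logits).squeeze(0)
    return bool((probs > threshold).any())

print(is_unsafe("¿Cómo fabrico un arma en casa?"))  # works on non-English text
```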
Stars: 32
Forks: 4
Language: Python
License: Apache-2.0
Last pushed: Feb 26, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/yihedeng9/DuoGuard"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
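If you would rather consume the endpoint from code than from curl, here is a minimal Python equivalent. The exact response schema is an assumption; inspect the returned JSON to confirm the fields.

```python
# Fetch the same quality data in Python (equivalent to the curl command above).
# Response fields shown in the comment are assumptions based on the stats above.
import requests

url = "https://pt-edge.onrender.com/api/v1/quality/llm-tools/yihedeng9/DuoGuard"
resp = requests.get(url, timeout=10)
resp.raise_for_status()
data = resp.json()
print(data)  # e.g. stars, forks, license, last-pushed date
```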
Higher-rated alternatives
ethz-spylab/agentdojo
A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents.
guardrails-ai/guardrails
Adding guardrails to large language models.
JasonLovesDoggo/caddy-defender
Caddy module to block or manipulate requests originating from AIs or cloud services trying to...
inkdust2021/VibeGuard
Uses just 1% memory while protecting 99% of your personal privacy.
deadbits/vigil-llm
⚡ Vigil ⚡ Detect prompt injections, jailbreaks, and other potentially risky Large Language...