AhsanAyub/malicious-prompt-detection
Detection of malicious prompts used to exploit large language models (LLMs) by leveraging supervised machine learning classifiers.
This project helps developers and engineers building applications powered by large language models (LLMs) to identify and block malicious prompts. It takes user input prompts and classifies them as either 'benign' or 'malicious' to prevent prompt injection attacks. This is for engineers and developers responsible for the security and robustness of their LLM-based applications.
No commits in the last 6 months.
Use this if you are building an application that uses large language models and need to protect it from prompt injection attacks.
Not ideal if you are a non-technical user looking for a ready-to-use content moderation tool for general text inputs, rather than LLM-specific security.
Stars
20
Forks
4
Language
Python
License
—
Category
Last pushed
Oct 30, 2024
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/prompt-engineering/AhsanAyub/malicious-prompt-detection"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
dronefreak/PromptScreen
Protect your LLMs from prompt injection and jailbreak attacks. Easy-to-use Python package with...
anmolksachan/LLMInjector
Burp Suite Extension for LLM Prompt Injection Testing
rv427447/Cognitive-Hijacking-in-Long-Context-LLMs
🧠Explore cognitive hijacking in long-context LLMs, revealing vulnerabilities in prompt...
moketchups/permanently-jailbroken
We asked 6 AIs about their own programming. All 6 said jailbreaking will never be fixed. Run it...
AdityaBhatt3010/When-LinkedIn-Gmail-Obey-Hidden-AI-Prompts-Lessons-in-Indirect-Prompt-Injection
A real-world look at how hidden instructions in profiles and emails trick AI into unexpected...