wuyoscar/ISC-Bench

Internal Safety Collapse: Turning LLMs into a "Jailbroken State" Without a "Jailbreak Attack".

Score: 70 / 100 · Verified

This project helps AI safety researchers and red teamers evaluate how Large Language Models (LLMs) can produce harmful content not through direct attacks, but while completing common, sensitive professional tasks. You provide a workflow template to an LLM, and the project helps you identify and document the cases where the model's helpfulness inadvertently leads to unsafe outputs. The primary users are professionals focused on responsible AI development and auditing.

677 stars. Actively maintained with 337 commits in the last 30 days.

Use this if you need to test the inherent safety vulnerabilities of LLMs as they perform routine, sensitive tasks, rather than probing them with explicit 'jailbreak' prompts.

Not ideal if you are looking for tools to deliberately bypass LLM safety features for malicious purposes or real-world harm, as this project is strictly for academic safety research.

AI Safety · LLM Evaluation · Red Teaming · Responsible AI · Content Moderation
No Package · No Dependents
Maintenance: 25 / 25
Adoption: 10 / 25
Maturity: 11 / 25
Community: 24 / 25
(The four component scores sum to the overall 70 / 100.)


Stars: 677
Forks: 127
Language: Python
License:
Last pushed: Mar 28, 2026
Commits (30d): 337

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/wuyoscar/ISC-Bench"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
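For scripting, the same endpoint can be called from Python. This is a minimal sketch using the requests library, assuming the endpoint returns a JSON body (the response format is not documented on this page):

import requests

# Public quality-score endpoint shown in the curl example above.
URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools/wuyoscar/ISC-Bench"

response = requests.get(URL, timeout=10)
response.raise_for_status()  # fail loudly on rate limiting or server errors

data = response.json()  # assumed JSON payload with the scores and stats shown on this page
print(data)

Since the free tier allows 100 requests per day without a key, any polling should be spaced accordingly.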