UCSC-VLAA/vllm-safety-benchmark
[ECCV 2024] Official PyTorch Implementation of "How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs"
This project helps AI researchers and developers evaluate the safety and robustness of their Vision-Language Models (VLLMs). It takes a VLLM and a variety of challenging image and text datasets as input, then measures how well the VLLM handles out-of-distribution scenarios and red-teaming attacks. The output provides quantifiable metrics on a VLLM's safety performance, helping identify vulnerabilities before deployment.
No commits in the last 6 months.
Use this if you are developing or deploying Vision-Language Models and need to rigorously assess their behavior and safety under challenging and adversarial conditions.
Not ideal if you are looking for a general-purpose VLLM or a tool to generate adversarial attacks without a focus on safety benchmarking.
Stars: 87
Forks: 5
Language: Python
License: —
Category: —
Last pushed: Nov 28, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/UCSC-VLAA/vllm-safety-benchmark"
Open to everyone: 100 requests/day with no key needed; a free key raises the limit to 1,000 requests/day.
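The same endpoint can also be queried from code. A minimal Python sketch using only the standard library; the response is assumed to be JSON, since its schema is not documented here:

```python
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"

def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch quality data for a GitHub repo (assumes a JSON response body)."""
    url = f"{API_BASE}/{owner}/{repo}"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)

# Example (no key needed, subject to the 100 requests/day limit):
# data = fetch_quality("UCSC-VLAA", "vllm-safety-benchmark")
```

The example call is left commented out to avoid consuming rate-limited requests during import.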
Higher-rated alternatives
ExtensityAI/symbolicai
A neurosymbolic perspective on LLMs
TIGER-AI-Lab/MMLU-Pro
The code and data for "MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding...
deep-symbolic-mathematics/LLM-SR
[ICLR 2025 Oral] This is the official repo for the paper "LLM-SR" on Scientific Equation...
microsoft/interwhen
A framework for verifiable reasoning with language models.
zhudotexe/fanoutqa
Companion code for FanOutQA: Multi-Hop, Multi-Document Question Answering for Large Language...