AI45Lab/DeepScan

Diagnostic Framework for LLMs and MLLMs

28 / 100
Experimental

This framework helps AI researchers and engineers evaluate Large Language Models (LLMs) and Multimodal LLMs (MLLMs) to understand their behavior and identify potential issues. You input a model (like Qwen, Llama, or Gemma) and a dataset, and it outputs detailed diagnostic reports and analyses of the model's performance and internal workings. It's designed for anyone who needs to rigorously test and improve the reliability and safety of large AI models.

Use this if you need a structured way to diagnose the performance and safety aspects of various LLMs and MLLMs, getting detailed insights beyond simple accuracy metrics.

Not ideal if you're looking for a simple, out-of-the-box solution for basic model evaluation without needing deep diagnostic insights or customizability.

LLM-evaluation AI-safety model-diagnosis AI-benchmarking AI-research
No package · No dependents
Maintenance 10 / 25
Adoption 7 / 25
Maturity 11 / 25
Community 0 / 25

How are scores calculated?

Stars: 34
Forks:
Language: Python
License:
Last pushed: Mar 02, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/AI45Lab/DeepScan"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
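The curl call above can also be scripted. Here is a minimal Python sketch using only the standard library, assuming the endpoint returns JSON (the response schema is not documented on this page, so the decoded dict is passed through as-is):

```python
import json
from urllib.request import urlopen

# Base path taken from the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def quality_url(owner: str, repo: str) -> str:
    """Build the quality-endpoint URL for a GitHub owner/repo pair."""
    return f"{BASE}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch and decode the quality report (performs a network request)."""
    with urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Prints the URL for this repo; call fetch_quality() to hit the API.
    print(quality_url("AI45Lab", "DeepScan"))
```

With no key this counts against the 100-requests/day limit; the page does not specify how an API key is passed, so that part is left out rather than guessed.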