poloclub/llm-landscape
NeurIPS'24 - LLM Safety Landscape
This tool helps AI safety researchers and ML engineers understand how robust their large language models (LLMs) are to fine-tuning and other weight modifications. It perturbs a fine-tuned LLM's weights and visualizes the resulting 'safety landscape', showing how far the weights can drift before the model's safety suddenly degrades. The output includes plots of this safety basin and a 'VISAGE score' that quantifies the model's safety robustness.
Use this if you are developing or deploying LLMs and need to rigorously assess the safety risks associated with fine-tuning, weight adjustments, or adversarial attacks.
Not ideal if you are looking for a simple, out-of-the-box solution for general LLM safety scanning without deep technical analysis of model weights.
Stars
39
Forks
7
Language
Python
License
MIT
Last pushed
Oct 21, 2025
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/poloclub/llm-landscape"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
sintel-dev/sigllm
Using Large Language Models for Time Series Anomaly Detection
guanwei49/LogLLM
LogLLM: Log-based Anomaly Detection Using Large Language Models (system log anomaly detection)
yangzhch6/AlignedCoT
Implementation of our paper "Speak Like a Native: Prompting Large Language Models in a Native Style"
CloudnetUCSC/VMFT-LAD
The source repository of "Virtual Machine Proactive Fault Tolerance using Log-based Anomaly...