lechmazur/deception

Benchmark evaluating LLMs on their ability to create and resist disinformation. Includes comprehensive testing across major models (Claude, GPT-4, Gemini, Llama, etc.) with standardized evaluation metrics.

Score: 21 / 100 (Experimental)

This benchmark helps you understand how well large language models (LLMs) can create believable false information and how easily they can be tricked by misleading content. It takes recent articles and questions as input, then evaluates various LLMs to produce scores for both their deceptive capabilities and their resistance to disinformation. Anyone working with or deploying LLMs in sensitive areas, such as content moderation, AI safety, or information security, would find this valuable.

No commits in the last 6 months.

Use this if you need to assess the trustworthiness and reliability of different LLMs for tasks where accuracy is critical and misinformation is a concern.

Not ideal if you are looking to benchmark general LLM performance on tasks like creative writing or basic question-answering.

AI safety, information integrity, content moderation, LLM evaluation, disinformation research
No License, Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 7 / 25
Maturity 8 / 25
Community 6 / 25

How are scores calculated?
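The page does not state the formula here, but the four category scores listed above add up to the overall 21 / 100. A minimal sketch of that arithmetic follows, assuming the overall score is a plain sum of the categories; that aggregation rule is inferred from the numbers shown, not confirmed by the source.

```python
# Category scores as shown on this page (each out of 25).
category_scores = {
    "Maintenance": 0,
    "Adoption": 7,
    "Maturity": 8,
    "Community": 6,
}

# Assumption: the overall score is the unweighted sum of the four categories.
overall = sum(category_scores.values())
max_overall = 25 * len(category_scores)

print(f"{overall} / {max_overall}")
```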

Stars: 32
Forks: 2
Language: not listed
License: not listed
Last pushed: Mar 20, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/lechmazur/deception"
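The same request can be sketched in Python using only the standard library. The base URL is copied from the curl example. The helper names `quality_url` and `fetch_quality` are hypothetical, and two things are assumptions rather than documented behavior: that other repositories follow the same owner/repo path layout, and that the response body is JSON.

```python
import json
import urllib.request

# Base URL copied from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"


def quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for an owner/repo pair (hypothetical helper)."""
    return f"{API_BASE}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the endpoint and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("lechmazur", "deception")` issues one real network request, which counts against the daily request limit.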

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.