lechmazur/deception

Benchmark evaluating LLMs on their ability to create and resist disinformation. Includes comprehensive testing across major models (Claude, GPT-4, Gemini, Llama, etc.) with standardized evaluation metrics.

Experimental

This benchmark measures how well large language models (LLMs) can produce believable false information and how easily misleading content can fool them. It takes recent articles and questions as input, then scores each LLM on both its deceptive capability and its resistance to disinformation. It is aimed at anyone working with or deploying LLMs in sensitive areas such as content moderation, AI safety, or information security.

No commits in the last 6 months.

Use this if you need to assess the trustworthiness and reliability of different LLMs for tasks where accuracy is critical and misinformation is a concern.

Not ideal if you are looking to benchmark general LLM performance on tasks like creative writing or basic question-answering.

AI safety, information integrity, content moderation, LLM evaluation, disinformation research

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 7 / 25

Maturity 8 / 25

Community 6 / 25

Higher-rated alternatives

TsinghuaC3I/MARTI

A Framework for LLM-based Multi-Agent Reinforced Training and Inference

zjunlp/KnowLM

An Open-sourced Knowledgeable Large Language Model Framework.

cli99/llm-analysis

Latency and Memory Analysis of Transformer Models for Training and Inference

tanyuqian/redco

NAACL '24 (Best Demo Paper RunnerUp) / MlSys @ NeurIPS '23 - RedCoast: A Lightweight Tool to...

stanleylsx/llms_tool

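A training and testing tool for large language models, built on HuggingFace. Supports web UI and terminal inference for each model, parameter-efficient and full-parameter training (pre-training, SFT, RM, PPO, DPO), and model merging and quantization.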
