cpldcpu/MisguidedAttention

A collection of prompts to challenge the reasoning abilities of large language models in the presence of misleading information

Quality score: 41 / 100 (Emerging)

This project offers a collection of specially crafted questions designed to test how well large language models (LLMs) reason when faced with misleading information. It helps evaluate whether an LLM can logically solve a problem as stated, or whether it defaults to familiar but incorrect answers learned during training. The input is a 'trick question' prompt, and the output is the LLM's response, revealing its reasoning strengths and weaknesses. Anyone responsible for evaluating or implementing LLMs in critical applications would find this useful.
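
To make that input/output flow concrete, here is a minimal sketch that sends one prompt to an LLM with the OpenAI Python client and captures the response. The prompt text is a hypothetical placeholder rather than an actual prompt from the repository, and the model name and API-key setup are assumptions.

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical placeholder; the real trick questions live in the repository's prompt files.
prompt = "A question that superficially resembles a well-known puzzle but has a changed detail."

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model name; substitute the model under test
    messages=[{"role": "user", "content": prompt}],
)

# The returned text is what you would inspect for pattern-matched, overfitted reasoning.
print(response.choices[0].message.content)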

466 stars. No commits in the last 6 months.

Use this if you need to rigorously test and benchmark the reasoning and problem-solving capabilities of different large language models, especially when 'common sense' or 'pattern recognition' might lead them astray.

Not ideal if you are looking for a tool to generate prompts for general creative writing tasks or for basic information retrieval from LLMs.

LLM-evaluation AI-benchmarking cognitive-testing prompt-engineering AI-safety
Status: Stale for 6 months. No published package. No dependents.
Maintenance: 2 / 25
Adoption: 10 / 25
Maturity: 16 / 25
Community: 13 / 25
These four components sum to the overall score of 41 / 100.

Stars: 466
Forks: 27
Language: Python
License: CC0-1.0
Last pushed: Jul 31, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/prompt-engineering/cpldcpu/MisguidedAttention"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
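
If you would rather fetch the same data from Python, a minimal sketch using the requests library is shown below. The endpoint is the one given above, but the shape of the returned JSON is an assumption, so inspect the payload before relying on specific keys.

import requests

url = "https://pt-edge.onrender.com/api/v1/quality/prompt-engineering/cpldcpu/MisguidedAttention"

# No API key is needed for up to 100 requests/day; pass a key if you have registered for one.
resp = requests.get(url, timeout=10)
resp.raise_for_status()

data = resp.json()
print(data)  # print the full payload to see the actual schema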