giuven95/chatgpt-failures

Failure archive for ChatGPT and similar models

Score: 30 / 100 (Emerging)

This archive compiles real-world examples of how large language models (LLMs) like ChatGPT and New Bing can produce unexpected, incorrect, or problematic outputs. It gathers specific instances where these AI models fail in tasks ranging from basic arithmetic and factual recall to exhibiting unusual conversational behaviors. Anyone involved in evaluating, testing, or researching the practical limitations of current LLMs can use this resource to understand common failure modes.

597 stars. No commits in the last 6 months.

Use this if you need concrete examples of LLM failures for research, comparative analysis of different models, or to generate synthetic test data to improve AI robustness.

Not ideal if you are looking for a guide on how to fix these LLM issues yourself, or if you need a collection of success stories.

AI evaluation, LLM testing, AI safety, Machine learning research, Natural language processing
No License, Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 8 / 25
Community 12 / 25

Stars: 597
Forks: 24
Language: Python
License: None
Last pushed: Apr 07, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/giuven95/chatgpt-failures"

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000/day.
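
For scripted access, the same request can be made from Python. The following is a minimal sketch using only the standard library. The URL pattern is taken from the curl command above; the helper names `quality_url` and `fetch_quality` are hypothetical, and the response is assumed (not confirmed by this page) to be JSON.

```python
import json
import urllib.request

# Base URL copied from the curl command on this page.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    return f"{API_BASE}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository.

    Assumption: the endpoint returns a JSON body. The page does not
    document the response schema, so the decoded value is returned as-is.
    """
    request = urllib.request.Request(
        quality_url(owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_quality("giuven95", "chatgpt-failures")` performs one request, which counts against the 100 requests/day keyless limit, so caching the result locally is advisable when polling several repositories.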