Buyun-Liang/SECA
[NeurIPS 2025] SECA: Semantically Equivalent and Coherent Attacks for Eliciting LLM Hallucinations
This project helps evaluate the reliability of Large Language Models (LLMs) by identifying their tendencies to produce incorrect or fabricated information, known as hallucinations. It takes an original question or prompt and generates a slightly reworded, but semantically identical, version designed to provoke a hallucination from the LLM. The output is an altered prompt and the corresponding LLM response, highlighting any erroneous content. This is useful for researchers, AI safety engineers, or product managers who need to assess and improve the trustworthiness of LLM applications.
Use this if you need to test how robust an LLM is to minor phrasing changes and discover specific vulnerabilities that lead to factually incorrect or inconsistent answers.
Not ideal if you are looking to generate diverse, creative, or entirely new prompts, as this tool focuses on minimal, meaning-preserving alterations.
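The described workflow (reword a prompt while preserving meaning, query the model, and flag divergent answers) can be sketched as follows. This is a minimal illustrative sketch, not the repository's actual API: `paraphrase`, `ask_model`, and `elicits_hallucination` are hypothetical names, and the canned answers stand in for real LLM calls.

```python
def paraphrase(prompt: str) -> str:
    """Toy meaning-preserving rewrite. SECA-style methods search over
    such candidates and filter them for semantic equivalence and
    coherence; here we apply one fixed rewording for illustration."""
    return prompt.replace("Who wrote", "Which author wrote")

def ask_model(prompt: str) -> str:
    """Stand-in for a real LLM call; the canned responses simulate a
    model that answers the original prompt correctly but hallucinates
    on the reworded one."""
    canned = {
        "Who wrote 'Hamlet'?": "William Shakespeare",
        "Which author wrote 'Hamlet'?": "Christopher Marlowe",  # fabricated
    }
    return canned.get(prompt, "I don't know")

def elicits_hallucination(original: str, reference: str):
    """Return (reworded prompt, wrong answer) if the semantically
    equivalent rewording makes the model diverge from the reference
    answer; otherwise return None."""
    candidate = paraphrase(original)
    answer = ask_model(candidate)
    if answer != reference:
        return candidate, answer
    return None

result = elicits_hallucination("Who wrote 'Hamlet'?", "William Shakespeare")
print(result)
```

The output pairs the altered prompt with the erroneous response, matching the tool's described output format of an adversarial prompt plus the hallucinated answer.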
Stars: 68
Forks: 1
Language: Python
License: MIT
Category:
Last pushed: Dec 10, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/Buyun-Liang/SECA"
Open to everyone: 100 requests/day with no key required; a free key raises the limit to 1,000/day.
Higher-rated alternatives
THU-BPM/MarkLLM
MarkLLM: An Open-Source Toolkit for LLM Watermarking (EMNLP 2024 System Demonstration)
git-disl/Vaccine
This is the official code for the paper "Vaccine: Perturbation-aware Alignment for Large...
zjunlp/Deco
[ICLR 2025] MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation
HillZhang1999/ICD
Code & Data for our Paper "Alleviating Hallucinations of Large Language Models through Induced...
voidism/DoLa
Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality...