SuperBruceJia/Awesome-LLM-Self-Consistency
Awesome LLM Self-Consistency: a curated list of self-consistency research for Large Language Models
To get reliable results from large language models, you need to understand how consistently they answer questions or complete tasks. This resource provides a curated collection of research papers and benchmarks focused on 'self-consistency' in LLMs. It helps researchers and AI practitioners evaluate and improve the dependability of their language models.
120 stars. No commits in the last 6 months.
Use this if you are a researcher or practitioner working with large language models and need to evaluate or improve their reliability in reasoning, factual accuracy, or logical coherence.
Not ideal if you are looking for an off-the-shelf tool or software to directly improve your LLM's consistency without delving into academic research.
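In the papers this list collects, self-consistency typically means sampling several reasoning paths for the same prompt and keeping the answer that the samples agree on most often. A minimal sketch of that majority-vote idea, assuming a hypothetical sample_answer callable that queries an LLM with nonzero temperature and returns a short final answer string (not part of this repo):

from collections import Counter

def self_consistency_answer(prompt, sample_answer, n_samples=5):
    # Draw several independent answers for the same prompt.
    answers = [sample_answer(prompt) for _ in range(n_samples)]
    # Majority vote over the sampled final answers.
    best_answer, count = Counter(answers).most_common(1)[0]
    # The vote share can serve as a rough consistency signal.
    return best_answer, count / n_samples

The vote share returned alongside the answer is one simple way to quantify the kind of reliability the listed papers evaluate.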
Stars: 120
Forks: 10
Language: —
License: MIT
Category: —
Last pushed: Jul 20, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/SuperBruceJia/Awesome-LLM-Self-Consistency"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
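The same data can be fetched programmatically; a minimal Python sketch assuming the endpoint returns JSON (the response fields are not documented in this listing, so the example just prints the parsed body):

import requests

url = "https://pt-edge.onrender.com/api/v1/quality/llm-tools/SuperBruceJia/Awesome-LLM-Self-Consistency"
response = requests.get(url, timeout=10)
response.raise_for_status()
# Inspect the returned structure directly; field names are not documented here.
print(response.json())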
Higher-rated alternatives
MMMU-Benchmark/MMMU
This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal...
pat-jj/DeepRetrieval
[COLM’25] DeepRetrieval — 🔥 Training Search Agent by RLVR with Retrieval Outcome
lupantech/MathVista
MathVista: data, code, and evaluation for Mathematical Reasoning in Visual Contexts
x66ccff/liveideabench
[Nature Communications] 🤖💡 LiveIdeaBench: Evaluating LLMs' Scientific Creativity and Idea...
ise-uiuc/magicoder
[ICML'24] Magicoder: Empowering Code Generation with OSS-Instruct