jennyzzt/LLM_debate_on_ARC

LLM Debate on ARC dataset

Score: 20 / 100 · Experimental

This project explores how a 'debate' among Large Language Models (LLMs) affects their ability to solve abstract reasoning tasks, specifically those in the ARC dataset. For each task, represented as input/output matrices, it elicits either a direct answer from an LLM or code generated by an LLM, then assesses performance by how closely the solution matches the correct answer. It is useful for researchers and practitioners working on advanced AI, particularly those focused on improving model reasoning and problem-solving.
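The core evaluation idea described above (comparing a predicted output matrix to the correct answer) can be sketched as a cell-for-cell grid comparison. This is an illustrative sketch, not the repository's actual code; the function name and sample grids are invented here.

```python
# Illustrative sketch of scoring an ARC-style prediction (not this repo's API).
# An ARC grid is a 2D list of small integers representing colors.

def exact_match(predicted, target):
    """Return True only if the predicted grid equals the target cell-for-cell."""
    if len(predicted) != len(target):
        return False
    return all(p_row == t_row for p_row, t_row in zip(predicted, target))

target = [[0, 1], [1, 0]]
print(exact_match([[0, 1], [1, 0]], target))  # identical grids
print(exact_match([[0, 1], [1, 1]], target))  # one cell differs
```

ARC scoring is all-or-nothing per task, which is why a strict equality check rather than a partial-credit metric is the natural baseline.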

No commits in the last 6 months.

Use this if you are researching advanced AI capabilities and want to understand how 'multi-agent debate' can influence a large language model's performance on complex, pattern-based reasoning problems like those in the ARC dataset.

Not ideal if you need a plug-and-play solution for general LLM fine-tuning or immediate deployment in a business application, as this is a research-focused exploration of reasoning techniques.

AI-research LLM-reasoning cognitive-AI abstract-reasoning AI-evaluation
Stale (6m) · No Package · No Dependents
Maintenance: 0 / 25
Adoption: 4 / 25
Maturity: 16 / 25
Community: 0 / 25


Stars: 7
Forks:
Language: Python
License: MIT
Last pushed: Jun 15, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/jennyzzt/LLM_debate_on_ARC"

Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
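To call the same endpoint from Python instead of curl, a minimal sketch is shown below. The host and the `ml-frameworks` category path are taken verbatim from the curl example; the actual network fetch is left as a comment so the snippet runs without network access.

```python
# Build the quality-API URL for a given repo (sketch; endpoint copied from
# the curl example above).

BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category, owner, repo):
    """Compose the API URL for a repository's quality report."""
    return f"{BASE}/{category}/{owner}/{repo}"

url = quality_url("ml-frameworks", "jennyzzt", "LLM_debate_on_ARC")
print(url)
# To actually fetch the JSON report:
#   import urllib.request, json
#   data = json.load(urllib.request.urlopen(url))
```

Composing the URL separately from the fetch keeps the snippet testable offline and makes it easy to point at other repositories.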