jennyzzt/LLM_debate_on_ARC

LLM Debate on ARC dataset

Score: 20 / 100 · Experimental

This project explores how a 'debate' among Large Language Models (LLMs) affects their ability to solve abstract reasoning tasks, specifically those in the ARC dataset. For each task, represented as input/output matrices, it elicits either a direct answer from an LLM or code generated by an LLM, then assesses performance by how closely the solution matches the correct answer. It is useful for researchers and practitioners working on advanced AI, particularly those focused on improving model reasoning and problem-solving.
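The core evaluation idea described above (comparing a predicted output matrix to the correct answer) can be sketched as a cell-for-cell grid comparison. This is an illustrative sketch, not the repository's actual code; the function name and sample grids are invented here.

```python
# Illustrative sketch of scoring an ARC-style prediction (not this repo's API).
# An ARC grid is a 2D list of small integers representing colors.

def exact_match(predicted, target):
    """Return True only if the predicted grid equals the target cell-for-cell."""
    if len(predicted) != len(target):
        return False
    return all(p_row == t_row for p_row, t_row in zip(predicted, target))

target = [[0, 1], [1, 0]]
print(exact_match([[0, 1], [1, 0]], target))  # identical grids
print(exact_match([[0, 1], [1, 1]], target))  # one cell differs
```

ARC scoring is all-or-nothing per task, which is why a strict equality check rather than a partial-credit metric is the natural baseline.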

No commits in the last 6 months.

Use this if you are researching advanced AI capabilities and want to understand how 'multi-agent debate' can influence a large language model's performance on complex, pattern-based reasoning problems like those in the ARC dataset.

Not ideal if you need a plug-and-play solution for general LLM fine-tuning or immediate deployment in a business application, as this is a research-focused exploration of reasoning techniques.

AI-research LLM-reasoning cognitive-AI abstract-reasoning AI-evaluation
Stale (6m) · No Package · No Dependents
Maintenance: 0 / 25
Adoption: 4 / 25
Maturity: 16 / 25
Community: 0 / 25


Stars: 7
Forks:
Language: Python
License: MIT
Last pushed: Jun 15, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/jennyzzt/LLM_debate_on_ARC"

Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
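To call the same endpoint from Python instead of curl, a minimal sketch is shown below. The host and the `ml-frameworks` category path are taken verbatim from the curl example; the actual network fetch is left as a comment so the snippet runs without network access.

```python
# Build the quality-API URL for a given repo (sketch; endpoint copied from
# the curl example above).

BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category, owner, repo):
    """Compose the API URL for a repository's quality report."""
    return f"{BASE}/{category}/{owner}/{repo}"

url = quality_url("ml-frameworks", "jennyzzt", "LLM_debate_on_ARC")
print(url)
# To actually fetch the JSON report:
#   import urllib.request, json
#   data = json.load(urllib.request.urlopen(url))
```

Composing the URL separately from the fetch keeps the snippet testable offline and makes it easy to point at other repositories.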