Shalev-Lifshitz/MultiAgentVerification

Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers

Quality score: 22 / 100 (Experimental)

This project improves the accuracy and reliability of large language models (LLMs) on complex problems such as mathematical reasoning and factual questions. It samples candidate solutions from an LLM and evaluates each one with multiple specialized 'verifiers' (themselves LLMs, each prompted to check a distinct aspect of a solution), then aggregates their judgments to select a more accurate final answer. It is aimed at AI researchers and practitioners who need to maximize an existing model's performance on specific tasks without retraining it.
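The selection scheme the description outlines can be sketched in a few lines. This is a minimal illustration, not the project's implementation: the two verifier functions below are hypothetical stand-ins, whereas in the actual project each verifier is an LLM prompted to check one aspect of a candidate answer.

```python
# Minimal sketch of multi-agent verification (best-of-n selection).
# Each "verifier" returns an approval score in [0, 1]; scores are
# aggregated by mean and the top-scoring candidate is returned.

def checks_arithmetic(candidate: str) -> float:
    """Hypothetical aspect verifier: approves if the correct sum appears."""
    return 1.0 if "4" in candidate else 0.0

def checks_formatting(candidate: str) -> float:
    """Hypothetical aspect verifier: approves if the answer is a full sentence."""
    return 1.0 if candidate.strip().endswith(".") else 0.0

VERIFIERS = [checks_arithmetic, checks_formatting]

def select_best(candidates: list[str]) -> str:
    """Score every candidate with every verifier and return the argmax."""
    def aggregate(c: str) -> float:
        return sum(v(c) for v in VERIFIERS) / len(VERIFIERS)
    return max(candidates, key=aggregate)

candidates = ["2 + 2 = 5", "2 + 2 = 4."]
print(select_best(candidates))  # -> 2 + 2 = 4.
```

With real LLM-backed verifiers, each stub would be replaced by a prompt-and-parse call, but the aggregation and argmax step stays the same.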

No commits in the last 6 months.

Use this if you need to boost the performance of an existing large language model on specific tasks by using multiple specialized verifiers to evaluate its outputs, rather than relying on a single verification method.

Not ideal if you are looking for a method to train or fine-tune LLMs, or if you only need a single, general verifier for your model's outputs.

Tags: AI-performance-optimization, LLM-evaluation, model-reliability, generative-AI, AI-safety
No License · Stale (6m) · No Package · No Dependents
Maintenance 0 / 25
Adoption 7 / 25
Maturity 8 / 25
Community 7 / 25

How are scores calculated?

Stars: 28
Forks: 2
Language: Python
License: None
Last pushed: Mar 01, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/Shalev-Lifshitz/MultiAgentVerification"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.