OpenDCAI/One-Eval

Automated system for LLM evaluation via agents.

Score: 39 / 100 (Emerging)

One-Eval helps AI product managers and researchers automatically assess the quality and performance of large language models (LLMs). You describe your evaluation goals in natural language, and it produces comprehensive reports detailing how well the LLM performs on tasks such as reasoning or general knowledge. This tool is for anyone responsible for developing, deploying, or overseeing LLM-powered applications.

Use this if you need to quickly and automatically evaluate your LLMs using natural language prompts, eliminating the need for manual script writing and benchmark configuration.

Not ideal if your evaluation requires complex sandbox environments or capabilities beyond plain text generation, such as code execution or Text2SQL, which are not yet fully supported.
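For a rough sense of the workflow described above, the sketch below shows what a natural-language evaluation request might look like in Python. It is purely hypothetical: the module, function, and parameter names are illustrative assumptions, not One-Eval's actual interface.

# Hypothetical sketch: the names below are illustrative assumptions,
# not One-Eval's actual API.
from one_eval import evaluate  # assumed entry point

report = evaluate(
    model="my-llm-endpoint",  # model or endpoint under test (assumed parameter)
    goal="Evaluate reasoning and general-knowledge performance",  # goal stated in natural language
)
print(report)  # a comprehensive evaluation report, per the description above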

LLM evaluation, AI model testing, natural language processing, model quality assurance, AI research
No Package · No Dependents
Maintenance 13 / 25
Adoption 6 / 25
Maturity 13 / 25
Community 7 / 25

How are scores calculated?
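The component scores shown above sum to the overall score: 13 + 6 + 13 + 7 = 39 out of 100.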

Stars: 24
Forks: 2
Language: Python
License: Apache-2.0
Last pushed: Mar 19, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/OpenDCAI/One-Eval"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
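The same endpoint can also be queried from a script. Below is a minimal Python sketch using the requests library; the shape of the returned JSON is not documented here, so the example simply fetches and prints it.

import requests

# Endpoint shown above; no API key is required for up to 100 requests/day.
URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools/OpenDCAI/One-Eval"

resp = requests.get(URL, timeout=10)
resp.raise_for_status()  # fail loudly on HTTP errors

data = resp.json()
print(data)  # inspect the returned fields; the response schema is not documented here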