geval-labs/geval
Eval-driven release gates for AI applications
geval automates the go/no-go decision for releasing a new AI model or prompt. You feed it performance metrics and business rules, and it consistently produces one of three verdicts: PASS, REQUIRE_APPROVAL, or BLOCK. It's built for anyone who manages the release of AI features: AI product managers, machine learning engineers, and release managers.
Use this if you need a consistent, auditable, automated way to decide release readiness for AI application changes based on multiple criteria and signals.
Not ideal if you want an AI system to make the release decision for you; this tool only applies the rules you define.
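The rule-based gate described above can be sketched as a small pure function. This is an illustrative sketch only: the type names, metric fields, and thresholds below are assumptions for the example, not geval's actual API.

```typescript
// Hypothetical release gate: apply predefined rules to eval metrics and
// return one of the three verdicts. Names and thresholds are illustrative.
type Verdict = "PASS" | "REQUIRE_APPROVAL" | "BLOCK";

interface Metrics {
  accuracy: number;        // eval pass rate, 0..1
  regressionDelta: number; // change vs. baseline; negative means worse
}

interface Rules {
  blockBelow: number;   // hard floor on accuracy: below this, block outright
  approveBelow: number; // soft floor: below this, require human sign-off
  maxRegression: number; // largest tolerated accuracy drop vs. baseline
}

function decide(m: Metrics, r: Rules): Verdict {
  if (m.accuracy < r.blockBelow) return "BLOCK";
  if (m.accuracy < r.approveBelow || -m.regressionDelta > r.maxRegression) {
    return "REQUIRE_APPROVAL";
  }
  return "PASS";
}

// Example: accuracy 0.91 clears both floors, but a 0.04 drop vs. baseline
// exceeds the tolerated regression, so a human must sign off.
const verdict = decide(
  { accuracy: 0.91, regressionDelta: -0.04 },
  { blockBelow: 0.8, approveBelow: 0.9, maxRegression: 0.02 },
);
console.log(verdict); // → "REQUIRE_APPROVAL"
```

Because the rules are explicit data rather than model judgment, the same inputs always yield the same verdict, which is what makes the gate auditable.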
Stars: 14
Forks: 6
Language: TypeScript
License: MIT
Category:
Last pushed: Feb 23, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/agents/geval-labs/geval"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
StonyBrookNLP/appworld
🌍 AppWorld: A Controllable World of Apps and People for Benchmarking Function Calling and...
qualifire-dev/rogue
AI Agent Evaluator & Red Team Platform
microsoft/WindowsAgentArena
Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of...
future-agi/ai-evaluation
Evaluation Framework for all your AI related Workflows
RouteWorks/RouterArena
RouterArena: An open framework for evaluating LLM routers with standardized datasets, metrics,...