Yuyz0112/relia
Find the Best LLM for Your Needs through E2E Testing
Relia helps AI application developers select, optimize, and continuously monitor large language models (LLMs) for their specific needs. It takes your LLM configuration and test cases as input and produces a benchmark report, so you can weigh performance against cost. The tool is aimed at engineers building AI agents or applications that rely on LLM 'function calling' to automate tasks.
No commits in the last 6 months.
Use this if you are building an AI application and need to rigorously test different LLMs, prompts, or model versions to ensure your application performs reliably, especially for 'tool use' scenarios.
Not ideal if you are looking for a general-purpose LLM evaluation framework that doesn't focus specifically on function calling or end-to-end application-level testing.
Stars: 83
Forks: 2
Language: TypeScript
License: Apache-2.0
Last pushed: Jun 12, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/Yuyz0112/relia"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
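The same endpoint can also be queried from code. The sketch below is a minimal TypeScript example; the `RepoQuality` field names are assumptions for illustration, not a documented response schema.

```typescript
// Minimal sketch of calling the quality API for a repository.
// NOTE: the fields of RepoQuality are hypothetical; inspect the
// actual JSON response to see what the API really returns.
interface RepoQuality {
  stars?: number;
  last_pushed?: string;
}

const BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools";

// Build the request URL for an owner/repo pair.
function qualityUrl(owner: string, repo: string): string {
  return `${BASE}/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}`;
}

// Fetch and parse the report (no API key needed: 100 requests/day).
async function fetchQuality(owner: string, repo: string): Promise<RepoQuality> {
  const res = await fetch(qualityUrl(owner, repo));
  if (!res.ok) throw new Error(`API error: HTTP ${res.status}`);
  return (await res.json()) as RepoQuality;
}

console.log(qualityUrl("Yuyz0112", "relia"));
```

With a free API key (1,000 requests/day), you would attach it to the request; how the key is passed (header vs. query parameter) is not documented here, so check the API provider's instructions.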
Higher-rated alternatives
open-compass/opencompass: OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral,...
IBM/unitxt: 🦄 Unitxt is a Python library for enterprise-grade evaluation of AI performance, offering the...
lean-dojo/LeanDojo: Tool for data extraction and interacting with Lean programmatically.
GoodStartLabs/AI_Diplomacy: Frontier Models playing the board game Diplomacy.
google/litmus: Litmus is a comprehensive LLM testing and evaluation tool designed for GenAI Application...