athina-ai/athina-sdk
LLM Testing SDK that helps you write and run tests to monitor your LLM app in production
This tool helps you ensure the reliability and quality of your AI application's outputs. It takes your LLM application's outputs and a set of predefined tests, then evaluates how well those outputs meet your criteria. It is for anyone building or managing an LLM-powered application who needs to verify the consistency and accuracy of the model's responses.
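As a rough sketch of that pattern (illustrative only; run_tests and the criteria below are hypothetical stand-ins, not athina-sdk's actual API):

from typing import Callable

def run_tests(outputs: list[str], criteria: list[Callable[[str], bool]]) -> list[bool]:
    """Return pass/fail for each output against every criterion."""
    return [all(check(out) for check in criteria) for out in outputs]

# Example criteria: the answer must be non-empty and must not
# leak an internal placeholder token.
criteria = [
    lambda out: len(out.strip()) > 0,
    lambda out: "[REDACTED]" not in out,
]

results = run_tests(["Paris is the capital of France.", ""], criteria)
print(results)  # [True, False]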
132 stars. No commits in the last 6 months.
Use this if you need to systematically test, monitor, and improve the performance of your LLM-powered application, both during development and in live production.
Not ideal if you are not working with Large Language Models or if your primary concern is traditional software unit testing.
Stars: 132
Forks: 1
Language: Python
License: —
Category: —
Last pushed: Jan 22, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/athina-ai/athina-sdk"
Open to everyone: 100 requests/day with no key required. A free key raises the limit to 1,000/day.
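The same request from Python (a minimal sketch assuming the endpoint returns JSON; the response schema is not documented here, so the script just pretty-prints whatever comes back):

import json
import requests

# Same endpoint as the curl example above; up to 100 requests/day
# work without a key.
URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools/athina-ai/athina-sdk"

resp = requests.get(URL, timeout=10)
resp.raise_for_status()

# Schema undocumented here: pretty-print the raw JSON response.
print(json.dumps(resp.json(), indent=2))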
Higher-rated alternatives
open-compass/opencompass
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral,...
IBM/unitxt
🦄 Unitxt is a Python library for enterprise-grade evaluation of AI performance, offering the...
lean-dojo/LeanDojo
Tool for data extraction and interacting with Lean programmatically.
GoodStartLabs/AI_Diplomacy
Frontier Models playing the board game Diplomacy.
google/litmus
Litmus is a comprehensive LLM testing and evaluation tool designed for GenAI Application...