langwatch/langevals
LangEvals aggregates language model evaluators into a single platform, providing a standard interface to a multitude of scores and LLM guardrails so you can protect and benchmark your LLM models and pipelines.
It brings a range of assessment methods into one place: feed it your model outputs and it returns scores and safety checks. It is aimed at anyone building or operating applications powered by large language models, such as product managers, AI safety engineers, and MLOps specialists.
Use this if you need a standardized way to measure the performance and safety of your language models and to ensure they don't produce undesirable content.
Not ideal if you are looking for a tool to train or fine-tune language models; LangEvals focuses solely on evaluation and guardrails.
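As a concrete illustration of that standard interface, here is a minimal batch-evaluation sketch. The `langevals.evaluate` entry point and the `langevals_ragas.answer_relevancy` module path are assumptions based on the repo's evaluator-per-package layout, not confirmed by this listing; check the repository README for the exact API.

import pandas as pd
import langevals  # assumed top-level package name
from langevals_ragas.answer_relevancy import RagasAnswerRelevancyEvaluator  # assumed module path

# Entries pair model inputs with the outputs you want scored.
entries = pd.DataFrame({
    "input": ["What does LangEvals do?"],
    "output": ["It aggregates LLM evaluators behind one standard interface."],
})

# Run one or more evaluators over the batch; Ragas-backed evaluators
# typically need provider credentials (e.g. OPENAI_API_KEY) in the environment.
results = langevals.evaluate(entries, [RagasAnswerRelevancyEvaluator()])
print(results.to_pandas())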
Stars: 71
Forks: 10
Language: —
License: —
Category: —
Last pushed: Feb 15, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/langwatch/langevals"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000 requests/day.
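For programmatic use, a minimal Python sketch of the same call. The endpoint and rate limits are as stated above; the response schema is not documented here, so the snippet simply prints the raw JSON for inspection.

import requests

url = "https://pt-edge.onrender.com/api/v1/quality/llm-tools/langwatch/langevals"
resp = requests.get(url, timeout=10)  # anonymous access: up to 100 requests/day
resp.raise_for_status()
print(resp.json())  # inspect the payload to see the available fields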
Higher-rated alternatives
EvolvingLMMs-Lab/lmms-eval
One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
vibrantlabsai/ragas
Supercharge Your LLM Application Evaluations 🚀
open-compass/VLMEvalKit
Open-source evaluation toolkit for large multi-modality models (LMMs); supports 220+ LMMs and 80+ benchmarks
EuroEval/EuroEval
The robust European language model benchmark.
Giskard-AI/giskard-oss
🐢 Open-Source Evaluation & Testing library for LLM Agents