langwatch/langevals

LangEvals aggregates various language model evaluators into a single platform, providing a standard interface to a multitude of scores and LLM guardrails so you can protect and benchmark your LLM models and pipelines.

Score: 41 / 100 (Emerging)

This tool helps you evaluate and protect your language model applications by bringing various assessment methods together in one place: it takes your model outputs and returns a range of scores and safety checks. It is designed for anyone building or managing applications powered by large language models, such as product managers, AI safety engineers, or MLOps specialists.

Use this if you need a standardized way to measure the performance and safety of your language models and ensure they don't produce undesirable content.

Not ideal if you are looking for a tool to train or fine-tune language models, as this focuses solely on evaluation and guardrails.

Tags: LLM-evaluation, AI-safety, NLP-benchmarking, model-guardrails, AI-application-monitoring
No License · No Package · No Dependents
Maintenance: 10 / 25
Adoption: 9 / 25
Maturity: 8 / 25
Community: 14 / 25
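
These four dimension scores sum to the overall rating: 10 + 9 + 8 + 14 = 41 out of 100.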

Stars: 71
Forks: 10
Language: (not reported)
License: none
Last pushed: Feb 15, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/langwatch/langevals"

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
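
For scripted access, here is a minimal shell sketch. It assumes the endpoint returns JSON; the X-API-Key header name, the PT_EDGE_API_KEY environment variable, and the use of jq for pretty-printing are illustrative assumptions, not documented parts of this API.

# Keyless tier (100 requests/day): fetch the quality record and pretty-print it.
curl -s "https://pt-edge.onrender.com/api/v1/quality/llm-tools/langwatch/langevals" | jq .

# Keyed tier (1,000 requests/day). The X-API-Key header is an assumed
# authentication scheme; check the API docs for the actual one.
curl -s -H "X-API-Key: $PT_EDGE_API_KEY" \
  "https://pt-edge.onrender.com/api/v1/quality/llm-tools/langwatch/langevals" | jq .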