PacificAI/langtest
Deliver safe & effective language models
langtest helps AI developers and MLOps engineers verify that their language models are fair, robust, and accurate before deployment. You feed it your model and it outputs a comprehensive report of performance across a battery of tests, along with suggestions for data augmentation. It is aimed at anyone building, evaluating, or maintaining large language models or other NLP systems.
Use this if you need to thoroughly test language models for biases, fairness, robustness, and accuracy, especially in regulated or sensitive domains like healthcare or finance.
Not ideal if you only need a basic performance metric, or want a tool that is not focused on responsible-AI testing.
Stars
552
Forks
50
Language
Python
License
Apache-2.0
Category
Last pushed
Feb 19, 2026
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/PacificAI/langtest"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000/day.
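The same endpoint can be called from Python instead of curl. This is a minimal sketch: the URL path comes from the curl example above, but the JSON field names (`repo`, `stars`) are illustrative assumptions, not documented response fields.

```python
# Sketch of querying the quality API for a repo, assuming a JSON response.
# Field names in summarize() are assumptions; adjust to the real payload.
import json
from urllib.parse import quote

BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"

def build_url(owner: str, repo: str) -> str:
    """Build the per-repo endpoint URL, escaping path segments."""
    return f"{BASE}/{quote(owner)}/{quote(repo)}"

def summarize(payload: dict) -> str:
    """One-line summary from a parsed response (hypothetical keys)."""
    return f"{payload.get('repo', '?')}: {payload.get('stars', 0)} stars"

url = build_url("PacificAI", "langtest")
print(url)

# Parse a sample payload offline; a live call would be
# urllib.request.urlopen(url).read() followed by json.loads().
sample = json.loads('{"repo": "PacificAI/langtest", "stars": 552}')
print(summarize(sample))
```

For live use, wrap the request in error handling, since the free tier is rate-limited to 100 requests/day.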
Related tools
microsoft/OpenRCA
[ICLR'25] OpenRCA: Can Large Language Models Locate the Root Cause of Software Failures?
Babelscape/ALERT
Official repository for the paper "ALERT: A Comprehensive Benchmark for Assessing Large Language...
TrustGen/TrustEval-toolkit
[ICLR'26, NAACL'25 Demo] Toolkit & Benchmark for evaluating the trustworthiness of generative...
ChenWu98/agent-attack
[ICLR 2025] Dissecting adversarial robustness of multimodal language model agents
Trust4AI/ASTRAL
Automated Safety Testing of Large Language Models