promptfoo and promptfoo-action

promptfoo is the core testing framework; promptfoo-action is a GitHub Action wrapper that integrates it into CI/CD pipelines. The two are complements designed to be used together rather than alternatives.

                 promptfoo          promptfoo-action
Overall score    81 (Verified)      54 (Established)
Maintenance      22/25              10/25
Adoption         14/25              8/25
Maturity         25/25              16/25
Community        20/25              20/25
Stars            14,219             47
Forks            1,297              23
Downloads
Commits (30d)    380                0
Language         TypeScript         TypeScript
License          MIT                MIT
Risk flags       None               No package, no dependents

About promptfoo

promptfoo/promptfoo

Test your prompts, agents, and RAGs. Red teaming/pentesting/vulnerability scanning for AI. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.

This tool helps AI developers and engineers evaluate and secure their large language model (LLM) applications. You provide your prompts, models (like GPT, Claude, or Llama), and test cases, and it generates performance comparisons and vulnerability reports. It is ideal for anyone building or deploying AI systems who needs to ensure their reliability and safety.

Tags: LLM development · AI security · Prompt engineering · Model evaluation · AI red teaming
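The "simple declarative configs" mentioned above are plain YAML. Here is a minimal sketch of an evaluation comparing two providers; the prompt, provider IDs, and assertion values are illustrative placeholders, not taken from this page:

```yaml
# promptfooconfig.yaml — hypothetical example
prompts:
  - "Summarize the following text in one sentence: {{text}}"

providers:
  - openai:gpt-4o-mini
  - anthropic:messages:claude-3-5-sonnet-20241022

tests:
  - vars:
      text: "Promptfoo evaluates LLM prompts against declarative test cases."
    assert:
      # Case-insensitive substring check on the model output
      - type: icontains
        value: "promptfoo"
```

Running `npx promptfoo@latest eval` against a config like this produces a side-by-side comparison of each provider's output on every test case.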

About promptfoo-action

promptfoo/promptfoo-action

The GitHub Action for Promptfoo. Test your prompts, agents, and RAGs. AI Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.

This GitHub Action helps AI developers, ML engineers, and prompt engineers automatically test their large language model (LLM) prompts and RAG (Retrieval Augmented Generation) systems. When you modify prompts in your code, it evaluates the changes and provides a "before/after" comparison directly in your pull request. This allows you to quickly see how prompt edits impact model performance and identify regressions or improvements.

Tags: LLM development · Prompt engineering · AI red teaming · Model evaluation · Continuous integration
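In practice, the action is wired into a workflow that runs whenever a pull request touches prompt files. The following is a sketch under the assumption that the input names match those documented in the action's README (check promptfoo/promptfoo-action for the current set); the paths and secret names are placeholders:

```yaml
# .github/workflows/prompt-eval.yml — hypothetical example
name: Evaluate prompts
on:
  pull_request:
    paths:
      - 'prompts/**'

jobs:
  evaluate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: promptfoo/promptfoo-action@v1
        with:
          github-token: ${{ secrets.GITHUB_TOKEN }}    # lets the action comment on the PR
          openai-api-key: ${{ secrets.OPENAI_API_KEY }}
          config: promptfooconfig.yaml
          prompts: prompts/**/*.txt
```

On each qualifying pull request, the action runs the evaluation and posts the before/after comparison described above as a comment on the PR.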

Scores updated daily from GitHub, PyPI, and npm data.