Arize-ai/phoenix
AI Observability & Evaluation
Phoenix helps AI practitioners understand and improve their Large Language Model (LLM) applications: you feed it your LLM's interactions and performance metrics, and it surfaces insights into how well your models are working and where they are going wrong. It is built for anyone building, evaluating, or maintaining LLM-powered applications, such as AI product managers, machine learning engineers, and data scientists.
8,847 stars. Used by 7 other packages. Actively maintained with 271 commits in the last 30 days. Available on PyPI.
Use this if you need to track, evaluate, and troubleshoot the performance of your LLM-powered applications across different versions and prompts.
Not ideal if you are looking for a general-purpose monitoring tool for non-AI applications or traditional machine learning models.
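Since Phoenix is a Python package on PyPI, spinning up a local instance takes only a couple of lines. The sketch below assumes the arize-phoenix package and its documented launch_app() entry point; exact names and behavior may vary between versions.

    import phoenix as px

    # Start the local Phoenix UI in a background thread and print its address.
    session = px.launch_app()
    print(f"Phoenix is running at {session.url}")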
Stars:              8,847
Forks:              753
Language:           Jupyter Notebook
License:            —
Category:           —
Last pushed:        Mar 13, 2026
Commits (30d):      271
Dependencies:       46
Reverse dependents: 7
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/prompt-engineering/Arize-ai/phoenix"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
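For scripted access, the same endpoint can be queried from Python. This is a minimal sketch using the requests library; the URL is taken from the curl example above, and the shape of the JSON response is not documented here, so the example simply pretty-prints whatever comes back.

    import json
    import requests

    URL = "https://pt-edge.onrender.com/api/v1/quality/prompt-engineering/Arize-ai/phoenix"

    resp = requests.get(URL, timeout=10)
    resp.raise_for_status()  # anonymous access is capped at 100 requests/day
    print(json.dumps(resp.json(), indent=2))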
Related tools
langfuse/langfuse
🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management,...
Mirascope/mirascope
The LLM Anti-Framework
Agenta-AI/agenta
The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM...
Helicone/helicone
🧊 Open source LLM observability platform. One line of code to monitor, evaluate, and experiment. YC W23 🍓
algorithmicsuperintelligence/optillm
Optimizing inference proxy for LLMs