langfuse and phoenix
These are competitors offering overlapping LLM observability and evaluation capabilities. Langfuse provides additional features such as prompt management and a playground, while Phoenix focuses more narrowly on observability and evaluations.
About langfuse
langfuse/langfuse
🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23
This platform helps AI application developers build, test, and improve their large language model (LLM) powered products. It ingests usage data from your LLM application and provides tools for debugging, evaluating performance, and managing prompts. The end users are developers, machine learning engineers, and product managers working on AI applications.
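As a rough illustration of that workflow, a function can be traced with Langfuse's Python SDK using its decorator and the OpenAI drop-in integration. This is a minimal sketch, assuming the decorator-based API and standard LANGFUSE_*/OPENAI_API_KEY environment variables; exact import paths and the model name are illustrative and may vary by SDK version.

```python
# Minimal sketch: trace an LLM call with Langfuse (imports assume the v2-style SDK).
from langfuse.decorators import observe
from langfuse.openai import OpenAI  # drop-in wrapper that auto-logs OpenAI calls to Langfuse

client = OpenAI()  # reads OPENAI_API_KEY; Langfuse credentials come from LANGFUSE_* env vars

@observe()  # records this function call as a trace in Langfuse
def answer(question: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[{"role": "user", "content": question}],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(answer("What does Langfuse do?"))
```

The resulting trace, including token usage and latency, then appears in the Langfuse UI, where the debugging, evaluation, and prompt-management tools operate on it.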
About phoenix
Arize-ai/phoenix
AI Observability & Evaluation
This tool helps AI practitioners understand and improve their Large Language Model (LLM) applications. You input your LLM's interactions and performance metrics, and it provides insights into how well your models are working and where they might be going wrong. It's for anyone building, evaluating, or maintaining LLM-powered applications, such as AI product managers, machine learning engineers, and data scientists.
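As a sketch of how those interactions get into the tool, Phoenix can be launched locally and fed traces through its OpenInference/OpenTelemetry instrumentation. The package names, default port, and model name below are assumptions based on a typical setup and may differ by version.

```python
# Minimal sketch: run Phoenix locally and auto-instrument OpenAI calls via OpenInference.
import phoenix as px
from phoenix.otel import register
from openinference.instrumentation.openai import OpenAIInstrumentor
from openai import OpenAI

px.launch_app()                # starts the local Phoenix UI (typically http://localhost:6006)
tracer_provider = register()   # points an OpenTelemetry tracer at the local Phoenix collector
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)

client = OpenAI()              # subsequent calls show up as traces in Phoenix
client.chat.completions.create(
    model="gpt-4o-mini",       # illustrative model name
    messages=[{"role": "user", "content": "Why is my RAG pipeline slow?"}],
)
```

From there, Phoenix's evaluation and analysis views work on the collected traces rather than on code changes in the application itself.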