langfuse vs. phoenix

Langfuse and Phoenix are competing tools with overlapping LLM observability and evaluation capabilities. Langfuse adds features such as prompt management and a playground, while Phoenix focuses more narrowly on observability and evals.

                langfuse           phoenix
Score           82 (Verified)      81 (Verified)
Maintenance     22/25              22/25
Adoption        15/25              15/25
Maturity        25/25              25/25
Community       20/25              19/25
Stars           23,106             8,847
Forks           2,333              753
Downloads       —                  —
Commits (30d)   252                271
Language        TypeScript         Jupyter Notebook
License         —                  —
Risk flags      none               none

About langfuse

langfuse/langfuse

🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23

This platform helps AI application developers build, test, and improve their large language model (LLM) powered products. It ingests usage data (traces) from your LLM application and provides tools for debugging, performance evaluation, and prompt management. Its users are developers, machine learning engineers, and product managers working on AI applications.

AI application development, LLM observability, prompt engineering, AI testing, machine learning operations
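The README above lists integrations with OpenTelemetry, Langchain, the OpenAI SDK, and LiteLLM. As a minimal sketch of what Langfuse tracing can look like in practice (assuming the Python SDK's `@observe` decorator and the `langfuse.openai` drop-in wrapper; exact import paths vary across SDK versions, so treat this as illustrative rather than canonical), the example below traces a single OpenAI call:

```python
# Minimal Langfuse tracing sketch. Assumes a v2-style Langfuse Python SDK;
# import paths differ between SDK versions.
from langfuse.decorators import observe
from langfuse.openai import openai  # drop-in wrapper that traces OpenAI calls

@observe()  # records this function as a trace in Langfuse
def answer(question: str) -> str:
    # The wrapped client reports model, token usage, and latency to Langfuse.
    response = openai.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": question}],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    # Credentials are read from the LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY,
    # and LANGFUSE_HOST environment variables, so no explicit client setup
    # is shown here.
    print(answer("What does an LLM observability platform do?"))
```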

About phoenix

Arize-ai/phoenix

AI Observability & Evaluation

This tool helps AI practitioners understand and improve their large language model (LLM) applications. You feed it your LLM's interactions and performance metrics, and it surfaces how well your models are working and where they are going wrong. It is aimed at anyone building, evaluating, or maintaining LLM-powered applications, including AI product managers, machine learning engineers, and data scientists.

LLM development, AI evaluation, prompt engineering, model troubleshooting, experiment tracking
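As a comparable sketch for Phoenix (assuming the `arize-phoenix` and `openinference-instrumentation-openai` packages; the registration API differs between releases, so this is an assumption-laden illustration, not documented usage), the example below launches the local Phoenix UI and auto-instruments OpenAI calls over OpenTelemetry:

```python
# Minimal Phoenix sketch: launch the local UI and auto-instrument OpenAI
# calls. Assumes the arize-phoenix and openinference-instrumentation-openai
# packages; module paths may differ between Phoenix releases.
import phoenix as px
from phoenix.otel import register
from openinference.instrumentation.openai import OpenAIInstrumentor
from openai import OpenAI

# Start the Phoenix server/UI in-process; traces appear at session.url.
session = px.launch_app()

# Create an OpenTelemetry tracer provider pointed at the local Phoenix
# collector, then route OpenAI SDK spans through it.
tracer_provider = register()
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)

client = OpenAI()
client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Why trace LLM calls?"}],
)
print(f"View traces at {session.url}")
```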

Scores updated daily from GitHub, PyPI, and npm data.