Langfuse and LangKit
These tools are complements: LangKit extracts monitoring signals (text quality, relevance, safety metrics) from LLM prompts and responses, and Langfuse can ingest and visualize those signals within its broader observability platform.
About Langfuse
langfuse/langfuse
🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23
This platform helps AI application developers build, test, and improve their large language model (LLM)-powered products. It ingests usage data from your LLM application and provides tools for debugging, evaluating performance, and managing prompts. Its end users are developers, machine learning engineers, and product managers working on AI applications.
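To make the observability model concrete, here is a minimal plain-Python stand-in for the kind of trace-and-generation record a platform like Langfuse ingests. The `Trace` and `Generation` classes below are hypothetical illustrations, not the actual Langfuse SDK, and the token counts are a crude whitespace proxy:

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for the kind of record an LLM observability
# platform ingests -- NOT the real Langfuse SDK.
@dataclass
class Generation:
    model: str
    prompt: str
    completion: str
    latency_ms: float
    prompt_tokens: int
    completion_tokens: int

@dataclass
class Trace:
    name: str
    generations: list = field(default_factory=list)

    def log_generation(self, model, prompt, completion, latency_ms):
        self.generations.append(Generation(
            model=model,
            prompt=prompt,
            completion=completion,
            latency_ms=latency_ms,
            # crude token proxy: whitespace word count
            prompt_tokens=len(prompt.split()),
            completion_tokens=len(completion.split()),
        ))

trace = Trace(name="support-chat")
trace.log_generation(
    model="gpt-4o",
    prompt="How do I reset my password?",
    completion="Click 'Forgot password' on the login page.",
    latency_ms=412.5,
)
print(len(trace.generations))  # 1
```

A real deployment would send such records to a Langfuse server (self-hosted or cloud), where they can be searched, scored, and charted; the sketch only shows the shape of the data being collected.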
About LangKit
whylabs/langkit
🔍 LangKit: An open-source toolkit for monitoring Large Language Models (LLMs). 📚 Extracts signals from prompts & responses, ensuring safety & security. 🛡️ Features include text quality, relevance metrics, & sentiment analysis. 📊 A comprehensive tool for LLM observability. 👀
This toolkit helps data scientists and ML engineers proactively monitor the behavior of their language models, including LLMs, in production. It takes the text prompts and responses from your model and extracts various signals like text quality, relevance, sentiment, and potential security risks. The output is a set of metrics that provide deep insights into how your language model is performing and interacting with users.
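The kinds of signals described above can be illustrated with a hand-rolled sketch. This is not LangKit's API; the function name, metric keys, and the naive refusal/relevance heuristics are all illustrative assumptions:

```python
# Illustration of the *kind* of signals a monitoring toolkit extracts
# from prompt/response pairs -- NOT LangKit's actual API or metrics.
REFUSAL_MARKERS = ("i cannot", "i can't", "i'm unable")

def extract_signals(prompt: str, response: str) -> dict:
    words = response.split()
    return {
        "response.char_count": len(response),
        "response.word_count": len(words),
        # crude readability proxy: mean word length
        "response.mean_word_len": (
            sum(len(w) for w in words) / len(words) if words else 0.0
        ),
        # naive safety flag: does the response look like a refusal?
        "response.has_refusal": any(
            m in response.lower() for m in REFUSAL_MARKERS
        ),
        # naive relevance proxy: shared vocabulary between prompt and response
        "relevance.overlap": len(
            set(prompt.lower().split()) & set(response.lower().split())
        ),
    }

signals = extract_signals(
    "How do I reset my password?",
    "I cannot help with account access.",
)
print(signals["response.has_refusal"])  # True
```

In the complementary workflow the repositories describe, metrics like these would be computed per prompt/response pair and then attached to the corresponding trace in an observability platform such as Langfuse for aggregation and alerting.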