dokimos-dev/dokimos
Evaluation Framework for LLM applications in Java and Kotlin
Dokimos is an evaluation framework for developers building applications with Large Language Models (LLMs) in Java and Kotlin. It lets you score LLM responses on quality dimensions such as hallucination and relevance, and track how performance changes over time. Developers can integrate Dokimos into their existing test suites to ensure their LLM applications maintain high quality before deployment.
Use this if you are a Java or Kotlin developer building LLM-powered applications and need a robust way to automatically test and evaluate your LLM's responses and agent behavior.
Not ideal if you are not a Java or Kotlin developer, or if you are looking for a no-code solution to evaluate LLMs.
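To illustrate the kind of check such a framework automates, here is a minimal, self-contained Java sketch of a relevance-style metric wired into a pass/fail gate. All names here (`RelevanceSketch`, `tokenOverlap`) are illustrative assumptions for this example, not Dokimos's actual API; a real setup would call the framework's own evaluators from your test suite.

```java
import java.util.HashSet;
import java.util.Set;

/**
 * Hypothetical sketch of a relevance-style metric, similar in spirit to
 * what an LLM evaluation framework computes. Names are illustrative,
 * not part of Dokimos's API.
 */
public class RelevanceSketch {

    /** Fraction of question tokens that also appear in the answer (0.0 to 1.0). */
    static double tokenOverlap(String question, String answer) {
        Set<String> q = tokenize(question);
        Set<String> a = tokenize(answer);
        if (q.isEmpty()) return 0.0;
        long hits = q.stream().filter(a::contains).count();
        return (double) hits / q.size();
    }

    /** Lowercases and splits on non-word characters. */
    static Set<String> tokenize(String text) {
        Set<String> tokens = new HashSet<>();
        for (String t : text.toLowerCase().split("\\W+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    public static void main(String[] args) {
        double score = tokenOverlap(
                "What is the capital of France?",
                "The capital of France is Paris.");
        // In a real test suite, a score below the threshold would fail the build.
        System.out.println(score >= 0.5 ? "PASS" : "FAIL");
    }
}
```

In practice you would replace the keyword-overlap heuristic with the framework's model-graded or reference-based evaluators, and the threshold check with a standard JUnit assertion.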
Stars
18
Forks
2
Language
Java
License
MIT
Category
Last pushed
Mar 08, 2026
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/dokimos-dev/dokimos"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
modelscope/evalscope
A streamlined and customizable framework for efficient large model (LLM, VLM, AIGC) evaluation...
izam-mohammed/ragrank
🎯 Your free LLM evaluation toolkit helps you assess the accuracy of facts, how well it...
Kareem-Rashed/rubric-eval
Independent framework to test, benchmark, and evaluate LLMs & AI agents locally.
justplus/llm-eval
A large language model evaluation platform supporting multiple evaluation benchmarks, custom datasets, and performance testing. Supports RAG evaluation based on custom datasets.
relari-ai/continuous-eval
Data-Driven Evaluation for LLM-Powered Applications