dokimos-dev/dokimos
Evaluation Framework for LLM applications in Java and Kotlin
Dokimos is an evaluation framework for developers building applications with Large Language Models (LLMs) in Java and Kotlin. It lets you score LLM responses on quality dimensions such as hallucination and relevance, and track how performance changes over time. Developers can integrate Dokimos into their existing test suites to ensure their LLM applications maintain high quality before deployment.
Use this if you are a Java or Kotlin developer building LLM-powered applications and need a robust way to automatically test and evaluate your LLM's responses and agent behavior.
Not ideal if you are not a Java or Kotlin developer, or if you are looking for a no-code solution to evaluate LLMs.
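To illustrate the kind of check such a framework automates, here is a minimal, self-contained Java sketch of a relevance-style metric wired into a pass/fail gate. All names here (`RelevanceSketch`, `tokenOverlap`) are illustrative assumptions for this example, not Dokimos's actual API; a real setup would call the framework's own evaluators from your test suite.

```java
import java.util.HashSet;
import java.util.Set;

/**
 * Hypothetical sketch of a relevance-style metric, similar in spirit to
 * what an LLM evaluation framework computes. Names are illustrative,
 * not part of Dokimos's API.
 */
public class RelevanceSketch {

    /** Fraction of question tokens that also appear in the answer (0.0 to 1.0). */
    static double tokenOverlap(String question, String answer) {
        Set<String> q = tokenize(question);
        Set<String> a = tokenize(answer);
        if (q.isEmpty()) return 0.0;
        long hits = q.stream().filter(a::contains).count();
        return (double) hits / q.size();
    }

    /** Lowercases and splits on non-word characters. */
    static Set<String> tokenize(String text) {
        Set<String> tokens = new HashSet<>();
        for (String t : text.toLowerCase().split("\\W+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    public static void main(String[] args) {
        double score = tokenOverlap(
                "What is the capital of France?",
                "The capital of France is Paris.");
        // In a real test suite, a score below the threshold would fail the build.
        System.out.println(score >= 0.5 ? "PASS" : "FAIL");
    }
}
```

In practice you would replace the keyword-overlap heuristic with the framework's model-graded or reference-based evaluators, and the threshold check with a standard JUnit assertion.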
Stars
18
Forks
2
Language
Java
License
MIT
Category
Last pushed
Mar 08, 2026
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/dokimos-dev/dokimos"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
modelscope/evalscope
A streamlined and customizable framework for efficient large model (LLM, VLM, AIGC) evaluation...
izam-mohammed/ragrank
🎯 Your free LLM evaluation toolkit helps you assess the accuracy of facts, how well it...
Kareem-Rashed/rubric-eval
Independent framework to test, benchmark, and evaluate LLMs & AI agents locally.
justplus/llm-eval
A large language model evaluation platform supporting multiple evaluation benchmarks, custom datasets, and performance testing. Supports RAG evaluation based on custom datasets.
relari-ai/continuous-eval
Data-Driven Evaluation for LLM-Powered Applications