ragrank and llm-evaluation
About ragrank
izam-mohammed/ragrank
🎯 A free LLM evaluation toolkit for assessing factual accuracy, context understanding, tone, and more, so you can see how well your LLM applications actually perform.
The toolkit evaluates Retrieval-Augmented Generation (RAG) applications: you provide the questions posed to your RAG pipeline, the contexts it retrieves, and the responses it generates, and ragrank returns metrics on factual accuracy, context understanding, and tone. It is aimed at AI/ML engineers, data scientists, and product managers who build and deploy LLM applications and need to confirm that their RAG systems deliver high-quality, reliable outputs.
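In practice that loop is only a few lines of Python. The snippet below is a minimal sketch under the assumption that the package exposes a dict-based dataset loader (`from_dict`) and an `evaluate` entry point in the style of its quickstart; the exact field names, default metrics, and model/API-key configuration may differ between versions, so check the repository for the authoritative API.

```python
# Minimal sketch of scoring one RAG interaction with ragrank.
# `from_dict` and `evaluate` are assumed from the quickstart-style API;
# consult the repo README for exact names and required model configuration.
from ragrank import evaluate
from ragrank.dataset import from_dict

data = from_dict(
    {
        "question": "Who wrote 'On the Origin of Species'?",
        "context": [
            "On the Origin of Species was published in 1859 by Charles Darwin."
        ],
        "response": "Charles Darwin wrote 'On the Origin of Species'.",
    }
)

result = evaluate(data)  # runs the default metrics over the example
print(result)            # inspect per-metric scores for the response
```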
About llm-evaluation
amitbad/llm-evaluation
Hands-on LLM evaluation learning repo — local models via Ollama, no paid APIs, no maths. Covers deterministic eval, LLM-as-a-Judge, hallucination testing, prompt injection, RAG evaluation, and agent trajectory scoring.
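To give a flavour of the LLM-as-a-Judge pattern the repo covers, the sketch below asks a locally served Ollama model to grade a candidate answer against a reference via Ollama's HTTP generate endpoint. It is not code from the repository; the model name, grading prompt, and 0-10 scoring scheme are placeholder assumptions.

```python
# Hypothetical LLM-as-a-Judge sketch against a local Ollama server
# (http://localhost:11434). Not taken from the llm-evaluation repo;
# the model name and grading prompt are placeholders.
import json
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"

def judge(question: str, reference: str, candidate: str, model: str = "llama3") -> dict:
    """Ask a local model to grade `candidate` against `reference`, returning its JSON verdict."""
    prompt = (
        "You are a strict grader. Given a question, a reference answer, and a "
        "candidate answer, reply with JSON only: "
        '{"score": <0-10>, "reason": "<one sentence>"}.\n\n'
        f"Question: {question}\nReference: {reference}\nCandidate: {candidate}"
    )
    resp = requests.post(
        OLLAMA_URL,
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    # Ollama returns the generated text under "response"; parse the judge's JSON.
    return json.loads(resp.json()["response"])

if __name__ == "__main__":
    verdict = judge(
        question="What is the capital of France?",
        reference="Paris",
        candidate="The capital of France is Paris.",
    )
    print(verdict)  # e.g. {"score": 10, "reason": "Matches the reference exactly."}
```

Forcing the judge to emit strict JSON and parsing it is what makes the scores easy to aggregate across a test set, which is the general trade-off this pattern revolves around.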