prompt-evaluator and eval-data

                 prompt-evaluator             eval-data
Overall score    34 (Emerging)                29 (Experimental)
Maintenance      6/25                         2/25
Adoption         3/25                         6/25
Maturity         13/25                        8/25
Community        12/25                        13/25
Stars            4                            17
Forks            1                            3
Downloads        n/a                          n/a
Commits (30d)    0                            0
Language         TypeScript                   TypeScript
License          MIT                          none
Flags            No package, no dependents    No license, stale 6 months, no package, no dependents

About prompt-evaluator

syamsasi99/prompt-evaluator

prompt-evaluator is an open-source toolkit for evaluating, testing, and comparing LLM prompts. It provides a GUI-driven workflow for running prompt tests, tracking token usage, visualizing results, and checking reliability across providers such as OpenAI, Anthropic (Claude), and Google (Gemini).
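
prompt-evaluator itself is GUI-driven, so the sketch below is not its API. It is a minimal TypeScript illustration, under assumed names (PromptTest, callModel, the model IDs), of the kind of test case such a tool executes: one prompt run against several models, with a pass/fail check and token counts per model.

```typescript
// NOTE: hypothetical sketch only; this is NOT prompt-evaluator's actual API.
// It shows the shape of a prompt test: one prompt, several target models,
// an assertion on the output, and token tracking.

interface PromptTest {
  name: string;
  prompt: string;
  models: string[]; // provider model IDs, e.g. "gpt-4o"
  check: (output: string) => boolean; // assertion on the model's output
}

interface TestResult {
  model: string;
  passed: boolean;
  tokensUsed: number;
}

// Hypothetical stand-in for a real provider client; stubbed so the
// sketch runs without network access or API keys.
async function callModel(
  model: string,
  prompt: string,
): Promise<{ text: string; tokens: number }> {
  return { text: `[${model}] response to: ${prompt}`, tokens: prompt.length };
}

// Run one test against every listed model and collect pass/fail + tokens.
async function runTest(test: PromptTest): Promise<TestResult[]> {
  const results: TestResult[] = [];
  for (const model of test.models) {
    const { text, tokens } = await callModel(model, test.prompt);
    results.push({ model, passed: test.check(text), tokensUsed: tokens });
  }
  return results;
}

// Compare the same prompt across three providers.
runTest({
  name: "summarize",
  prompt: "Summarize these release notes in one sentence.",
  models: ["gpt-4o", "claude-3-5-sonnet", "gemini-1.5-pro"],
  check: (out) => out.length > 0,
}).then((results) => console.table(results));
```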

About eval-data

paradite/eval-data

Prompts and evaluation data for LLMs on real-world coding and writing tasks

The repository collects prompts and expected outputs for evaluating how well large language models (LLMs) perform on real-world coding and writing tasks. Each entry pairs a specific task scenario (such as writing a Next.js todo app or explaining kanji) with benchmark data for assessing the model's generated code or text. It is aimed at AI researchers, prompt engineers, and product managers who are building or integrating LLM-powered applications.
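
To make the shape of such a dataset concrete, here is a hypothetical sketch of what one benchmark entry could look like. The repository's actual file format may differ; every name below (EvalEntry, referenceOutput, criteria, score) is an assumption for illustration.

```typescript
// Hypothetical shape for one benchmark entry; not the repository's actual
// format. Each entry pairs a task prompt with the data needed to judge a
// model's answer.

interface EvalEntry {
  task: string; // scenario, e.g. "Next.js todo app"
  category: "coding" | "writing";
  prompt: string; // what the LLM is asked to do
  referenceOutput: string; // expected code or text to compare against
  criteria: string[]; // what a grader (human or LLM) checks for
}

const todoAppEntry: EvalEntry = {
  task: "Next.js todo app",
  category: "coding",
  prompt: "Write a minimal Next.js todo app with add and delete actions.",
  referenceOutput: "// reference implementation stored alongside the prompt",
  criteria: [
    "Uses Next.js app-router conventions",
    "Todo state updates correctly on add and delete",
  ],
};

// A trivial scorer: the fraction of criteria a grader marked satisfied.
function score(satisfied: boolean[]): number {
  return satisfied.filter(Boolean).length / satisfied.length;
}

console.log(todoAppEntry.task, score([true, false])); // -> Next.js todo app 0.5
```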

Tags: LLM evaluation, prompt engineering, AI benchmarking, code-generation testing, content-creation assessment

Scores updated daily from GitHub, PyPI, and npm data.