ricsinaruto/dialog-eval
Evaluate your dialog model with 17 metrics! (see paper)
This project helps evaluate how well your automated dialogue system (like a chatbot) is performing. You provide your chatbot's generated responses alongside the expected (ground-truth) responses and, optionally, your training data. The tool then outputs a report covering 17 different metrics, helping you understand the quality and characteristics of your bot's conversations. It's designed for anyone building and refining conversational AI, such as researchers or machine learning engineers working on dialogue systems.
No commits in the last 6 months.
Use this if you need to quantitatively measure and compare the performance of different chatbot models using a wide range of established linguistic and semantic metrics.
Not ideal if you're looking for a user-interface-driven tool for live chatbot testing or qualitative human-in-the-loop evaluation.
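To give a sense of what these metrics measure, here is a minimal, illustrative sketch (not the repository's own code) of two diversity metrics commonly reported for dialogue models, distinct-1 and distinct-2, computed the standard way: unique n-grams divided by total n-grams across all generated responses.

# Illustrative only: distinct-n diversity metrics over a list of generated responses.
def distinct_n(responses, n):
    total, unique = 0, set()
    for line in responses:
        tokens = line.split()
        # All n-grams of the current response.
        ngrams = list(zip(*[tokens[i:] for i in range(n)]))
        total += len(ngrams)
        unique.update(ngrams)
    return len(unique) / total if total else 0.0

responses = ["i am fine thank you", "i am fine", "good to see you"]
print("distinct-1:", round(distinct_n(responses, 1), 3))
print("distinct-2:", round(distinct_n(responses, 2), 3))

Higher distinct-n values indicate less repetitive output; dialog-eval reports this kind of metric alongside length-, embedding-, and overlap-based ones.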
Stars: 97
Forks: 19
Language: Python
License: MIT
Category:
Last pushed: Aug 07, 2020
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/ricsinaruto/dialog-eval"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
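If you prefer Python over curl, a minimal sketch using the requests library follows; the endpoint returns JSON, but its exact schema isn't documented here, so the example simply prints the parsed response.

import requests

url = "https://pt-edge.onrender.com/api/v1/quality/embeddings/ricsinaruto/dialog-eval"
resp = requests.get(url, timeout=10)
resp.raise_for_status()
print(resp.json())  # schema not documented here; inspect the returned fields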
Higher-rated alternatives
embeddings-benchmark/mteb
MTEB: Massive Text Embedding Benchmark
harmonydata/harmony
The Harmony Python library: a research tool for psychologists to harmonise data and...
yannvgn/laserembeddings
LASER multilingual sentence embeddings as a pip package
embeddings-benchmark/results
Data for the MTEB leaderboard
Hironsan/awesome-embedding-models
A curated list of awesome embedding models tutorials, projects and communities.