ricsinaruto/dialog-eval
Evaluate your dialog model with 17 metrics! (see paper)
This project helps evaluate how well your automated dialogue system (like a chatbot) is performing. You provide your chatbot's generated responses alongside the expected (ground-truth) responses and, optionally, your training data. The tool then outputs a report covering 17 different metrics, helping you understand the quality and characteristics of your bot's conversations. It's designed for anyone building and refining conversational AI, such as researchers or machine learning engineers working on dialogue systems.
No commits in the last 6 months.
Use this if you need to quantitatively measure and compare the performance of different chatbot models using a wide range of established linguistic and semantic metrics.
Not ideal if you're looking for a user-interface-driven tool for live chatbot testing or qualitative human-in-the-loop evaluation.
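To give a sense of what these metrics measure, here is a minimal, illustrative sketch (not the repository's own code) of two diversity metrics commonly reported for dialogue models, distinct-1 and distinct-2, computed the standard way: unique n-grams divided by total n-grams across all generated responses.

# Illustrative only: distinct-n diversity metrics over a list of generated responses.
def distinct_n(responses, n):
    total, unique = 0, set()
    for line in responses:
        tokens = line.split()
        # All n-grams of the current response.
        ngrams = list(zip(*[tokens[i:] for i in range(n)]))
        total += len(ngrams)
        unique.update(ngrams)
    return len(unique) / total if total else 0.0

responses = ["i am fine thank you", "i am fine", "good to see you"]
print("distinct-1:", round(distinct_n(responses, 1), 3))
print("distinct-2:", round(distinct_n(responses, 2), 3))

Higher distinct-n values indicate less repetitive output; dialog-eval reports this kind of metric alongside length-, embedding-, and overlap-based ones.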
Stars: 97
Forks: 19
Language: Python
License: MIT
Category:
Last pushed: Aug 07, 2020
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/ricsinaruto/dialog-eval"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
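If you prefer Python over curl, a minimal sketch using the requests library follows; the endpoint returns JSON, but its exact schema isn't documented here, so the example simply prints the parsed response.

import requests

url = "https://pt-edge.onrender.com/api/v1/quality/embeddings/ricsinaruto/dialog-eval"
resp = requests.get(url, timeout=10)
resp.raise_for_status()
print(resp.json())  # schema not documented here; inspect the returned fields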
Higher-rated alternatives
embeddings-benchmark/mteb
MTEB: Massive Text Embedding Benchmark
harmonydata/harmony
The Harmony Python library: a research tool for psychologists to harmonise data and...
yannvgn/laserembeddings
LASER multilingual sentence embeddings as a pip package
embeddings-benchmark/results
Data for the MTEB leaderboard
Hironsan/awesome-embedding-models
A curated list of awesome embedding models tutorials, projects and communities.