disi-unibo-nlp/nlg-metricverse
[COLING22] An End-to-End Library for Evaluating Natural Language Generation
When developing or researching Natural Language Generation (NLG) models, it's crucial to assess their output accurately. This tool takes your model's generated text and a set of human-written reference texts, then computes a broad range of automatic evaluation metrics. It's designed for NLP researchers and engineers who need to understand and compare the quality of different NLG systems, such as those used for summarization, translation, or chatbots.
No commits in the last 6 months. Available on PyPI.
Use this if you are developing or fine-tuning NLG models and need a comprehensive, consistent way to evaluate their performance using a wide array of automatic metrics.
Not ideal if you primarily rely on human evaluation or only need a single, basic metric for a well-established NLG task.
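As a quick illustration, here is a minimal scoring sketch in the style of the project's README; the NLGMetricverse class, the load_metric helper, and the metric names are assumptions taken from that README, so check the repository for the exact interface:

from nlgmetricverse import NLGMetricverse, load_metric  # assumed import path, per the README

# Pick the metrics to compute (names assumed; the library exposes many more).
metrics = [load_metric("bleu"), load_metric("rouge")]
scorer = NLGMetricverse(metrics=metrics)

# One generated text per human-written reference.
predictions = ["The cat sat on the mat."]
references = ["A cat was sitting on the mat."]

# The scorer returns a dict of per-metric results.
scores = scorer(predictions=predictions, references=references)
print(scores)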
Stars: 94
Forks: 5
Language: Python
License: MIT
Category: NLP
Last pushed: Dec 18, 2023
Commits (30d): 0
Dependencies: 26
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/disi-unibo-nlp/nlg-metricverse"
Open to everyone: 100 requests/day with no key required. Get a free key for 1,000 requests/day.
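For programmatic access, here is a minimal Python sketch of the same call; the response schema is not documented here, so the code simply prints whatever JSON the endpoint returns:

import requests

# Public endpoint from the listing above; no key needed for up to 100 requests/day.
URL = "https://pt-edge.onrender.com/api/v1/quality/nlp/disi-unibo-nlp/nlg-metricverse"

resp = requests.get(URL, timeout=10)
resp.raise_for_status()
print(resp.json())  # schema not documented here; inspect the payload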
Higher-rated alternatives
google/langfun
OO for LLMs
tanaos/artifex
Small Language Model Inference, Fine-Tuning and Observability. No GPU, no labeled data needed.
preligens-lab/textnoisr
Add random noise to a text dataset while precisely controlling the quality of the result
vulnerability-lookup/VulnTrain
A tool to generate datasets and models based on vulnerability descriptions from @Vulnerability-Lookup.
masakhane-io/masakhane-mt
Machine Translation for Africa