pyrouge and rouge
Both projects implement the ROUGE metric family for evaluating summarization quality, but in different ways: pyrouge is a Python wrapper around the canonical Perl ROUGE-1.5.5 package, while rouge is a standalone implementation.
About pyrouge
bheinzerling/pyrouge
A Python wrapper for the ROUGE summarization evaluation package
This tool helps researchers and developers working on text summarization evaluate the quality of automatically generated summaries. It takes plain-text system summaries and the corresponding 'gold standard' reference summaries, then processes them to produce standardized ROUGE scores. Anyone building or comparing text summarization models would use it to quantify performance.
About rouge
neural-dialogue-metrics/rouge
An implementation of ROUGE family metrics for automatic summarization.
This tool helps researchers and developers working with natural language processing evaluate the quality of automatically generated summaries or translations. You input two pieces of text: a reference (the 'correct' version) and a candidate (the machine-generated version). It outputs scores (recall, precision, and F-measure) that indicate how well the candidate text matches the reference.
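The recall, precision, and F-measure scores described above come from n-gram overlap between the candidate and the reference. A minimal ROUGE-1-style sketch in pure Python (the `rouge_n` helper is hypothetical, not either library's API; the real packages additionally handle tokenization, stemming, and multiple references):

```python
from collections import Counter

def rouge_n(reference, candidate, n=1):
    """Illustrative ROUGE-N: recall, precision, F1 from n-gram overlap.

    A simplified sketch; neither pyrouge's nor rouge's actual interface.
    """
    def ngrams(tokens, n):
        # Count each n-gram so repeated n-grams are clipped correctly.
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    # Clipped overlap: each n-gram counts at most as often as it
    # appears in the reference.
    overlap = sum((ref & cand).values())
    recall = overlap / max(sum(ref.values()), 1)
    precision = overlap / max(sum(cand.values()), 1)
    f1 = (2 * recall * precision / (recall + precision)
          if recall + precision else 0.0)
    return recall, precision, f1

r, p, f = rouge_n("the cat sat on the mat", "the cat lay on the mat")
# 5 of 6 reference unigrams are matched, so recall = precision = 5/6
```

Here recall and precision coincide because both texts have the same length; in general a short candidate trades recall for precision, which is why the F-measure is usually reported.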