obss/jury

Comprehensive NLP Evaluation System

Score: 50/100 (Established)

This tool evaluates the performance of natural language processing (NLP) models by comparing their generated text against human-written references. You supply the model's text outputs and the corresponding reference texts, and it reports standard metrics such as BLEU, ROUGE, and BERTScore. It's designed for researchers and engineers who build and refine NLP systems like machine translation, text summarization, or chatbots.
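
A minimal sketch of that workflow, following the usage pattern in the project's README (the default metric set is an assumption; verify it for your installed version):

from jury import Jury

# Model outputs and the human-written references they are scored against.
predictions = ["Peace in the dormitory, peace in the world."]
references = ["Peace at home, peace in the world."]

# With no arguments, Jury runs its default metric set; pass metrics=[...]
# to choose your own (see the second sketch below).
scorer = Jury()
scores = scorer(predictions=predictions, references=references)
print(scores)  # a dict of per-metric results plus bookkeeping fields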

188 stars. Used by 1 other package. No commits in the last 6 months. Available on PyPI.

Use this if you need to quickly and comprehensively assess how well your NLP model is performing across a variety of standard metrics.

Not ideal if you only need a single, specific metric or are not working with text generation or understanding tasks.
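
That said, if you want only one metric but still like this interface, the constructor accepts an explicit metric list. A sketch following the README's pattern (the string metric names are an assumption to check against the library's supported list):

from jury import Jury

# Restrict evaluation to a single metric instead of the full default set.
scorer = Jury(metrics=["bleu"])
scores = scorer(
    predictions=["the cat sat on the mat"],
    references=["the cat is sitting on the mat"],
)
print(scores)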

natural-language-processing machine-translation text-summarization chatbot-development nlp-model-evaluation
Status: Stale (no commits in 6 months)
Maintenance: 0/25
Adoption: 11/25
Maturity: 25/25
Community: 14/25

Stars: 188
Forks: 19
Language: Python
License: MIT
Last pushed: Aug 08, 2024
Commits (30d): 0
Dependencies: 9
Reverse dependents: 1

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/obss/jury"

Open to everyone: 100 requests/day with no key. A free key raises the limit to 1,000 requests/day.
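
For scripted access, the same endpoint can be fetched from Python. A small sketch, assuming the endpoint returns a JSON body (the response schema is not documented on this page):

import requests

url = "https://pt-edge.onrender.com/api/v1/quality/nlp/obss/jury"
resp = requests.get(url, timeout=10)  # unauthenticated: 100 requests/day
resp.raise_for_status()
print(resp.json())  # assumes JSON; inspect resp.text if decoding fails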