obss/jury

Comprehensive NLP Evaluation System

Score: 50/100 (Established)

This tool evaluates the performance of natural language processing (NLP) models by comparing their generated text against human-written references. You supply the model's text outputs and the corresponding reference texts, and it reports standard metrics such as BLEU, ROUGE, and BERTScore. It's designed for researchers and engineers who build and refine NLP systems like machine translation, text summarization, or chatbots.
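
A minimal sketch of that workflow, following the usage pattern in the project's README (the default metric set is an assumption; verify it for your installed version):

from jury import Jury

# Model outputs and the human-written references they are scored against.
predictions = ["Peace in the dormitory, peace in the world."]
references = ["Peace at home, peace in the world."]

# With no arguments, Jury runs its default metric set; pass metrics=[...]
# to choose your own (see the second sketch below).
scorer = Jury()
scores = scorer(predictions=predictions, references=references)
print(scores)  # a dict of per-metric results plus bookkeeping fields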

188 stars. Used by 1 other package. No commits in the last 6 months. Available on PyPI.

Use this if you need to quickly and comprehensively assess how well your NLP model is performing across a variety of standard metrics.

Not ideal if you only need a single, specific metric or are not working with text generation or understanding tasks.
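
That said, if you want only one metric but still like this interface, the constructor accepts an explicit metric list. A sketch following the README's pattern (the string metric names are an assumption to check against the library's supported list):

from jury import Jury

# Restrict evaluation to a single metric instead of the full default set.
scorer = Jury(metrics=["bleu"])
scores = scorer(
    predictions=["the cat sat on the mat"],
    references=["the cat is sitting on the mat"],
)
print(scores)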

natural-language-processing machine-translation text-summarization chatbot-development nlp-model-evaluation
Status: Stale (no commits in 6 months)
Maintenance: 0/25
Adoption: 11/25
Maturity: 25/25
Community: 14/25

Stars: 188
Forks: 19
Language: Python
License: MIT
Last pushed: Aug 08, 2024
Commits (30d): 0
Dependencies: 9
Reverse dependents: 1

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/obss/jury"

Open to everyone: 100 requests/day with no key. A free key raises the limit to 1,000 requests/day.
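
For scripted access, the same endpoint can be fetched from Python. A small sketch, assuming the endpoint returns a JSON body (the response schema is not documented on this page):

import requests

url = "https://pt-edge.onrender.com/api/v1/quality/nlp/obss/jury"
resp = requests.get(url, timeout=10)  # unauthenticated: 100 requests/day
resp.raise_for_status()
print(resp.json())  # assumes JSON; inspect resp.text if decoding fails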