nervaluate and nereval
These are **competitors** — both provide entity-level NER evaluation metrics implementing similar methodologies (SemEval-based F1 scoring), with nervaluate being the more mature and widely-adopted option.
About nervaluate
MantisAI/nervaluate
Full named-entity (i.e., not tag/token) evaluation metrics based on SemEval’13
When you're building systems that identify specific entities like people, organizations, or locations in text, it's crucial to measure accurately how well your system performs. This tool helps you evaluate your named entity recognition (NER) models by comparing your system's output against a set of known correct labels. It goes beyond simple token-by-token checks: following the SemEval'13 methodology, it scores each prediction under several matching schemes (strict, exact boundary, partial boundary, and entity type), so you can see whether the system got the whole entity right, partially right, or made a specific kind of mistake. It's aimed at anyone who needs to assess the quality of text analysis systems, such as computational linguists, data scientists, or researchers working in natural language processing.
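To make the distinction concrete, here is a minimal pure-Python sketch of the idea behind two of the SemEval-style schemes: "strict" matching (type and boundaries must both agree) versus "partial" matching (boundary overlap is enough). The function names and span representation are illustrative assumptions, not nervaluate's actual API; nervaluate's Evaluator handles this (and more error categories) for you.

```python
# Illustrative sketch only -- not nervaluate's API.
# Entities are (type, start, end) tuples over token offsets.

def strict_matches(gold, pred):
    """Count predictions whose type AND boundaries exactly match a gold span."""
    gold_set = set(gold)
    return sum(1 for p in pred if p in gold_set)

def partial_matches(gold, pred):
    """Count predictions whose span overlaps some gold span, ignoring type."""
    count = 0
    for _, ps, pe in pred:
        if any(ps < ge and gs < pe for _, gs, ge in gold):
            count += 1
    return count

gold = [("PER", 0, 2), ("LOC", 5, 7)]
pred = [("PER", 0, 2), ("LOC", 5, 6), ("ORG", 9, 10)]

print(strict_matches(gold, pred))   # only ("PER", 0, 2) matches exactly -> 1
print(partial_matches(gold, pred))  # two predictions overlap gold spans -> 2
```

Under strict scoring the truncated LOC prediction counts as an error, while under partial scoring it still earns credit, which is exactly the kind of nuance this evaluation style surfaces.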
About nereval
jantrienes/nereval
Evaluation script for named entity recognition (NER) systems based on entity-level F1 score.
This tool helps you assess how well your automated system identifies and categorizes specific entities within text, like product names or dates. You input a list of the entities your system found and compare it against a 'ground truth' list of what should have been found. The output is a clear F1-score, indicating the accuracy of your system's entity recognition. It's ideal for data scientists, NLP researchers, or anyone building systems that automatically extract structured information from unstructured text.
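As a rough illustration of what an entity-level F1 score means, here is a small sketch using one common definition: an entity counts as a true positive only if its type and boundaries exactly match a gold entity. The function name and tuple format are hypothetical, and nereval's exact matching rules may differ in detail.

```python
# Illustrative sketch of entity-level F1 -- not nereval's actual API.
# Entities are (type, start, end) tuples.

def entity_f1(gold, pred):
    """F1 over whole entities: a prediction is correct only on exact match."""
    gold_set, pred_set = set(gold), set(pred)
    tp = len(gold_set & pred_set)                       # exact matches
    precision = tp / len(pred_set) if pred_set else 0.0
    recall = tp / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

gold = [("PER", 0, 2), ("DATE", 4, 5), ("ORG", 7, 9)]
pred = [("PER", 0, 2), ("DATE", 4, 6)]

print(entity_f1(gold, pred))  # 1 true positive, 2 predictions, 3 gold entities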
Related comparisons
Scores updated daily from GitHub, PyPI, and npm data. How scores work