tokenmill/dictionary-annotator

Fast and configurable UIMA dictionary annotator.

Experimental

This tool helps you quickly scan large amounts of text to find and label specific phrases, names, or terms from a dictionary (CSV file). For example, you can feed it a list of company names or medical conditions, and it will identify every mention in your documents, tagging each instance with relevant details like location or category. This is ideal for text analysis professionals, researchers, or anyone needing to extract structured information from unstructured text.
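The matching described above can be illustrated with a minimal sketch. This is not the project's code or API and it does not use UIMA. The class `DictionarySketch`, its methods, and the CSV layout (a header row, the phrase in the first column, attributes in the remaining columns) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; not the annotator's real implementation.
final class DictionarySketch {

    /** One match: character offsets, the surface text, and the dictionary row's attributes. */
    static final class Match {
        final int begin;
        final int end;
        final String phrase;
        final Map<String, String> attributes;

        Match(int begin, int end, String phrase, Map<String, String> attributes) {
            this.begin = begin;
            this.end = end;
            this.phrase = phrase;
            this.attributes = attributes;
        }
    }

    /**
     * Parses a tiny CSV dictionary. The first line is the header, the first column
     * is the phrase, and the remaining columns become attributes.
     * Assumption: no quoting or escaped commas.
     */
    static Map<String, Map<String, String>> loadDictionary(List<String> csvLines) {
        Map<String, Map<String, String>> dictionary = new LinkedHashMap<>();
        String[] header = csvLines.get(0).split(",");
        for (String line : csvLines.subList(1, csvLines.size())) {
            String[] cells = line.split(",");
            String phrase = cells[0].trim().toLowerCase(Locale.ROOT);
            if (phrase.isEmpty()) {
                continue; // skip blank rows
            }
            Map<String, String> attributes = new LinkedHashMap<>();
            for (int i = 1; i < header.length && i < cells.length; i++) {
                attributes.put(header[i].trim(), cells[i].trim());
            }
            dictionary.put(phrase, attributes);
        }
        return dictionary;
    }

    /**
     * Finds every case-insensitive occurrence of every dictionary phrase.
     * Assumption: lowercasing does not change string length (true for ASCII text).
     * This naive scan is for clarity, not speed, and ignores word boundaries.
     */
    static List<Match> annotate(String text, Map<String, Map<String, String>> dictionary) {
        List<Match> matches = new ArrayList<>();
        String lowered = text.toLowerCase(Locale.ROOT);
        for (Map.Entry<String, Map<String, String>> entry : dictionary.entrySet()) {
            String phrase = entry.getKey();
            int from = lowered.indexOf(phrase);
            while (from >= 0) {
                int end = from + phrase.length();
                matches.add(new Match(from, end, text.substring(from, end), entry.getValue()));
                from = lowered.indexOf(phrase, from + 1);
            }
        }
        return matches;
    }
}
```

Under these assumptions, loading the rows `phrase,category,country` and `Acme Corp,company,US`, then calling `annotate` on a sentence that mentions "Acme Corp", yields one `Match` per mention, each carrying the `category` and `country` values from its row. A tool built for large datasets would more likely use a trie or an Aho-Corasick automaton than repeated `indexOf` calls.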

No commits in the last 6 months.

Use this if you need to rapidly find and categorize predefined phrases or entities in large text datasets and assign multiple, specific attributes to each match.

Not ideal if you need advanced natural language processing features beyond simple dictionary matching, such as sentiment analysis or deep semantic understanding.

text-mining information-extraction document-analysis data-labeling named-entity-recognition

Stale 6m No Package No Dependents

Maintenance 0 / 25

Adoption 4 / 25

Maturity 16 / 25

Community 0 / 25

Language Java

License —

Higher-rated alternatives

apache/opennlp

Apache OpenNLP

stanfordnlp/CoreNLP

CoreNLP: A Java suite of core NLP tools for tokenization, sentence segmentation, NER, parsing,...

dkpro/dkpro-core

Collection of software components for natural language processing (NLP) based on the Apache UIMA...

stanfordnlp/python-stanford-corenlp

Python interface to CoreNLP using a bidirectional server-client interface.

apache/opennlp-sandbox

Apache OpenNLP Sandbox
