tokenmill/dictionary-annotator
Fast and configurable UIMA dictionary annotator.
This tool helps you quickly scan large amounts of text to find and label specific phrases, names, or terms from a dictionary (CSV file). For example, you can feed it a list of company names or medical conditions, and it will identify every mention in your documents, tagging each instance with relevant details like location or category. This is ideal for text analysis professionals, researchers, or anyone needing to extract structured information from unstructured text.
No commits in the last 6 months.
Use this if you need to rapidly find and categorize predefined phrases or entities in large text datasets and assign multiple, specific attributes to each match.
Not ideal if you need advanced natural language processing features beyond simple dictionary matching, such as sentiment analysis or deep semantic understanding.
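To make the description above concrete, here is a minimal Java sketch of dictionary matching with per-phrase attributes. It does not use this project's code or the UIMA API. The class `DictionarySketch`, its methods, and the CSV layout (a header row, the phrase in the first column, attribute columns after it) are all assumptions made for illustration, and the naive scan shown here is not tuned for speed.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: not the project's implementation or API.
public class DictionarySketch {

    /** One dictionary hit: character offsets in the text plus the CSV attributes. */
    public static final class Match {
        public final int begin;
        public final int end;
        public final String covered;
        public final Map<String, String> attributes;

        Match(int begin, int end, String covered, Map<String, String> attributes) {
            this.begin = begin;
            this.end = end;
            this.covered = covered;
            this.attributes = attributes;
        }
    }

    /** Reads rows such as "Acme Corp,company,Berlin" (assumed layout, no quoted fields). */
    public static Map<String, Map<String, String>> loadDictionary(List<String> csvLines) {
        Map<String, Map<String, String>> dictionary = new LinkedHashMap<>();
        if (csvLines.isEmpty()) {
            return dictionary;
        }
        String[] header = csvLines.get(0).split(",", -1);
        for (String line : csvLines.subList(1, csvLines.size())) {
            String[] cols = line.split(",", -1);
            String phrase = cols[0].trim();
            if (phrase.isEmpty()) {
                continue; // skip blank rows
            }
            Map<String, String> attrs = new LinkedHashMap<>();
            for (int i = 1; i < cols.length && i < header.length; i++) {
                attrs.put(header[i].trim(), cols[i].trim());
            }
            dictionary.put(phrase, attrs);
        }
        return dictionary;
    }

    /** Finds every case-insensitive, whole-word occurrence of each dictionary phrase. */
    public static List<Match> annotate(String text, Map<String, Map<String, String>> dictionary) {
        List<Match> matches = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> entry : dictionary.entrySet()) {
            String phrase = entry.getKey();
            for (int i = 0; i + phrase.length() <= text.length(); i++) {
                int end = i + phrase.length();
                if (text.regionMatches(true, i, phrase, 0, phrase.length())
                        && isBoundary(text, i - 1) && isBoundary(text, end)) {
                    matches.add(new Match(i, end, text.substring(i, end), entry.getValue()));
                }
            }
        }
        matches.sort(Comparator.comparingInt((Match m) -> m.begin));
        return matches;
    }

    /** True when the index lies outside the text or on a non-alphanumeric character. */
    private static boolean isBoundary(String text, int index) {
        return index < 0 || index >= text.length()
                || !Character.isLetterOrDigit(text.charAt(index));
    }

    public static void main(String[] args) {
        List<String> csv = List.of(
                "phrase,category,location",
                "Acme Corp,company,Berlin",
                "asthma,condition,n/a");
        String text = "Acme Corp hired a doctor who treats asthma; ACME CORP is based in Berlin.";
        for (Match m : annotate(text, loadDictionary(csv))) {
            System.out.println(m.begin + "-" + m.end + " " + m.covered + " " + m.attributes);
        }
    }
}
```

Each match carries the full attribute map from its CSV row, which is the "multiple, specific attributes" idea in miniature. A tool built for large datasets would more likely use a trie or an Aho-Corasick automaton than this nested scan.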
Stars: 7
Forks: n/a
Language: Java
License: n/a
Category:
Last pushed: Apr 17, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/tokenmill/dictionary-annotator"
Open to everyone: 100 requests/day with no key needed, or 1,000 requests/day with a free key.
Higher-rated alternatives
apache/opennlp
Apache OpenNLP
stanfordnlp/CoreNLP
CoreNLP: A Java suite of core NLP tools for tokenization, sentence segmentation, NER, parsing,...
dkpro/dkpro-core
Collection of software components for natural language processing (NLP) based on the Apache UIMA...
stanfordnlp/python-stanford-corenlp
Python interface to CoreNLP using a bidirectional server-client interface.
apache/opennlp-sandbox
Apache OpenNLP Sandbox