tspannhw/nifi-langdetect-processor
Apache NiFi + Apache Tika + OptimaizeLangDetector
This processor automatically identifies the language of text flowing through an Apache NiFi pipeline, such as English, Spanish, or dozens of other languages. It takes unstructured text as input and outputs the detected language code. Operations engineers and data processors who handle multinational data streams will find it useful.
No commits in the last 6 months.
Use this if you need to automatically categorize or route text-based information based on its language within an Apache NiFi data flow.
Not ideal if you need to translate text or perform deep linguistic analysis beyond language identification.
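Routing on the detected language code can be sketched in plain Java as below. This is an illustration only, not the processor's code: the attribute name `language`, the route names, and the `LanguageRouter` class are hypothetical, and in a real NiFi flow this decision would normally be made by a routing processor reading whatever attribute the language-detection step actually writes.

```java
import java.util.Locale;
import java.util.Map;

final class LanguageRouter {
    // Hypothetical attribute name; the real processor may use a different one.
    static final String LANGUAGE_ATTRIBUTE = "language";

    // Hypothetical mapping from language code to a downstream route name.
    private static final Map<String, String> ROUTES = Map.of(
            "en", "english-queue",
            "es", "spanish-queue");

    // Picks a route from a FlowFile-style attribute map.
    // Missing, blank, or unknown codes fall through to "unmatched".
    static String routeFor(Map<String, String> attributes) {
        String code = attributes.get(LANGUAGE_ATTRIBUTE);
        if (code == null || code.isBlank()) {
            return "unmatched";
        }
        String normalized = code.trim().toLowerCase(Locale.ROOT);
        return ROUTES.getOrDefault(normalized, "unmatched");
    }

    public static void main(String[] args) {
        System.out.println(routeFor(Map.of(LANGUAGE_ATTRIBUTE, "es"))); // spanish-queue
        System.out.println(routeFor(Map.of(LANGUAGE_ATTRIBUTE, "fr"))); // unmatched
    }
}
```

Sending unknown codes to an explicit "unmatched" route, rather than dropping them, keeps text in unexpected languages visible for later inspection.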
Stars: 7
Forks: 1
Language: Java
License: Apache-2.0
Category
Last pushed: May 20, 2022
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/tspannhw/nifi-langdetect-processor"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
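The same request can be made from Java with the standard `java.net.http` client. This is a minimal sketch under stated assumptions: the `QualityApiClient` class and its method names are hypothetical, the meaning of the `nlp` path segment (treated here as a category) is a guess from the URL, and the response format is not documented here, so the body is returned as a raw string.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

final class QualityApiClient {
    // Base path taken from the curl example above.
    static final String BASE = "https://pt-edge.onrender.com/api/v1/quality";

    // Builds the endpoint URI; the three-segment layout is an assumption
    // inferred from the single URL shown in the listing.
    static URI endpoint(String category, String owner, String repo) {
        return URI.create(BASE + "/" + category + "/" + owner + "/" + repo);
    }

    // Sends a GET request and returns the raw response body.
    static String fetch(URI uri) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(uri)
                .timeout(Duration.ofSeconds(10))
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        URI uri = endpoint("nlp", "tspannhw", "nifi-langdetect-processor");
        System.out.println(fetch(uri));
    }
}
```

With the keyless limit of 100 requests/day, a caller that polls this endpoint would likely want to cache responses rather than refetch on every run.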
Higher-rated alternatives
apache/opennlp
Apache OpenNLP
stanfordnlp/CoreNLP
CoreNLP: A Java suite of core NLP tools for tokenization, sentence segmentation, NER, parsing,...
stanfordnlp/python-stanford-corenlp
Python interface to CoreNLP using a bidirectional server-client interface.
dkpro/dkpro-core
Collection of software components for natural language processing (NLP) based on the Apache UIMA...
apache/opennlp-sandbox
Apache OpenNLP Sandbox