proycon/nederlab-pipeline
Linguistic enrichment pipeline for historical Dutch, as used in the Nederlab project
This tool helps researchers and linguists automatically analyze historical Dutch texts. It takes raw text data, primarily in the FoLiA XML format, and enriches it with detailed linguistic annotations like parts of speech, lemmas, language identification, named entities, and even modernised spellings for 17th-century Dutch. Digital humanists, computational linguists, and historians working with large archives of Middle Dutch or Early New Dutch documents would find this pipeline invaluable.
No commits in the last 6 months.
Use this if you need to perform advanced linguistic analysis on large collections of historical Dutch texts and require a structured output with detailed annotations.
Not ideal if your focus is on contemporary Dutch or languages other than Dutch, Middle Dutch, or Early New Dutch.
Stars: 8
Forks: 1
Language: Groovy
License: —
Category:
Last pushed: Oct 13, 2022
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/proycon/nederlab-pipeline"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
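The same request can be made from a script. The following is a minimal sketch, not the service's documented client: it assumes the path segments after `/quality/` are a category (`nlp`), a repository owner, and a repository name, as the curl URL above suggests, and it assumes the endpoint returns JSON. The helper names `build_url` and `fetch_quality` are hypothetical. It uses only the keyless tier, since the listing does not say how a key is passed.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category: str, owner: str, repo: str) -> str:
    """Build the request URL (hypothetical helper).

    Assumption: the three path segments are category, owner, and repo.
    """
    parts = [quote(p, safe="") for p in (category, owner, repo)]
    return f"{API_BASE}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response (hypothetical helper).

    Assumption: the body is JSON. The response schema is not described
    in the listing, so the decoded value is returned as-is.
    """
    url = build_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to make a real request.
    print(build_url("nlp", "proycon", "nederlab-pipeline"))
```

Because the keyless tier allows 100 requests per day, a script that polls many repositories may want to cache responses locally rather than refetch them.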
Higher-rated alternatives
apache/opennlp
Apache OpenNLP
stanfordnlp/CoreNLP
CoreNLP: A Java suite of core NLP tools for tokenization, sentence segmentation, NER, parsing,...
dkpro/dkpro-core
Collection of software components for natural language processing (NLP) based on the Apache UIMA...
stanfordnlp/python-stanford-corenlp
Python interface to CoreNLP using a bidirectional server-client interface.
apache/opennlp-sandbox
Apache OpenNLP Sandbox