proycon/nederlab-pipeline

Linguistic enrichment pipeline for historical Dutch, as used in the Nederlab project

Experimental

This tool helps researchers and linguists automatically analyze historical Dutch texts. It takes input documents, primarily in the FoLiA XML format, and enriches them with linguistic annotations: part-of-speech tags, lemmas, language identification, named entities, and modernised spellings for 17th-century Dutch. It is aimed at digital humanists, computational linguists, and historians working with large archives of Middle Dutch or Early New Dutch documents.
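To make the kind of output described above more concrete, here is a minimal sketch of reading token-level annotations from a FoLiA document with Python's standard library. It is not part of the pipeline. The sample document, its tag values (`PRON`, `VERB`), its lemmas, and the helper name `read_tokens` are all made up for illustration. The sketch assumes only the FoLiA namespace and the `w`, `t`, `pos`, and `lemma` elements with their `class` attributes.

```python
import xml.etree.ElementTree as ET

# FoLiA XML namespace; every FoLiA element lives in it.
FOLIA_NS = {"f": "http://ilk.uvt.nl/folia"}

# Hypothetical, hand-written sample. Real pipeline output carries more
# metadata and annotation layers; the tag and lemma values are invented.
SAMPLE = """<FoLiA xmlns="http://ilk.uvt.nl/folia" xml:id="demo">
  <text xml:id="demo.text">
    <s xml:id="demo.s.1">
      <w xml:id="demo.s.1.w.1"><t>Ick</t><pos class="PRON"/><lemma class="ik"/></w>
      <w xml:id="demo.s.1.w.2"><t>hebbe</t><pos class="VERB"/><lemma class="hebben"/></w>
    </s>
  </text>
</FoLiA>"""


def read_tokens(folia_xml):
    """Return (text, pos, lemma) tuples for every word element in the document."""
    root = ET.fromstring(folia_xml)
    rows = []
    for word in root.iterfind(".//f:w", FOLIA_NS):
        text = word.findtext("f:t", default="", namespaces=FOLIA_NS)
        pos = word.find("f:pos", FOLIA_NS)
        lemma = word.find("f:lemma", FOLIA_NS)
        rows.append((
            text,
            pos.get("class") if pos is not None else None,      # None if no PoS layer
            lemma.get("class") if lemma is not None else None,  # None if no lemma layer
        ))
    return rows


if __name__ == "__main__":
    for text, pos, lemma in read_tokens(SAMPLE):
        print(text, pos, lemma)
```

For large collections, a dedicated FoLiA library would likely be a better fit than hand-rolled XML parsing. The sketch only illustrates the shape of the annotations.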

No commits in the last 6 months.

Use this if you need to perform advanced linguistic analysis on large collections of historical Dutch texts and require a structured output with detailed annotations.

Not ideal if your focus is contemporary Dutch or any language other than historical Dutch (Middle Dutch or Early New Dutch).

historical linguistics, computational humanities, digital archives, Dutch studies, text enrichment

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 4 / 25

Maturity 16 / 25

Community 8 / 25

Language

Groovy

License

—

Higher-rated alternatives

apache/opennlp

Apache OpenNLP

stanfordnlp/CoreNLP

CoreNLP: A Java suite of core NLP tools for tokenization, sentence segmentation, NER, parsing,...

dkpro/dkpro-core

Collection of software components for natural language processing (NLP) based on the Apache UIMA...

stanfordnlp/python-stanford-corenlp

Python interface to CoreNLP using a bidirectional server-client interface.

apache/opennlp-sandbox

Apache OpenNLP Sandbox
