marcusklang/docria
Semi-structured Document Model (Next-generation)
Docria helps you organize and work with large collections of text documents, especially those with varied structures and origins. It allows you to store the original text along with annotations, relationships between parts of the text, and other extracted information. This is ideal for researchers, analysts, or anyone who needs to manage and process complex textual data like research papers, legal documents, or customer feedback.
Use this if you need a robust, language-independent way to store, share, and transform complex text documents, maintaining exact character positions and relationships within the text.
Not ideal if your data is purely tabular or consists of simple, flat records without complex nested structures or annotations on text spans.
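The idea of keeping the original text untouched while layering annotations and relationships over exact character positions can be illustrated with a small sketch. This is not Docria's actual API: the `Span`, `Node`, and `Document` classes, the layer names, and the `head` property are all hypothetical names chosen for illustration, assuming a simple stand-off annotation design with half-open character offsets.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Span:
    """Half-open character range [start, stop) into a document's text."""
    start: int
    stop: int


@dataclass
class Node:
    """An annotation: a label, the span it covers, and free-form properties."""
    label: str
    span: Span
    props: dict = field(default_factory=dict)


@dataclass
class Document:
    """Original text plus annotation layers stored separately from it."""
    text: str
    layers: dict = field(default_factory=dict)

    def add(self, layer: str, node: Node) -> Node:
        # Reject spans that do not fit inside the text, so offsets stay valid.
        if not (0 <= node.span.start <= node.span.stop <= len(self.text)):
            raise ValueError("span falls outside the text")
        self.layers.setdefault(layer, []).append(node)
        return node

    def covered_text(self, node: Node) -> str:
        # The text is never modified; annotations only point into it.
        return self.text[node.span.start:node.span.stop]


# Hypothetical usage: a token layer and an entity layer over the same text.
doc = Document("Docria stores text and annotations.")
token = doc.add("token", Node("word", Span(0, 6)))
# A relationship between annotations is stored as a property that references
# another node (the "head" key is an assumption made for this sketch).
entity = doc.add("entity", Node("software", Span(0, 6), {"head": token}))
```

Because annotations refer to the text by offset rather than by copying it, several layers (tokens, entities, sentences) can describe the same characters without disturbing one another, which is the property the description above highlights.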
Stars: 8
Forks: 1
Language: Java
License: Apache-2.0
Category
Last pushed: Jan 05, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/marcusklang/docria"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
Higher-rated alternatives
apache/opennlp
Apache OpenNLP
stanfordnlp/CoreNLP
CoreNLP: A Java suite of core NLP tools for tokenization, sentence segmentation, NER, parsing,...
dkpro/dkpro-core
Collection of software components for natural language processing (NLP) based on the Apache UIMA...
stanfordnlp/python-stanford-corenlp
Python interface to CoreNLP using a bidirectional server-client interface.
apache/opennlp-sandbox
Apache OpenNLP Sandbox