adobe/NLP-Cube

Natural Language Processing Pipeline - Sentence Splitting, Tokenization, Lemmatization, Part-of-speech Tagging and Dependency Parsing

/ 100

Emerging

This tool helps computational linguists and data scientists break raw text down into its basic linguistic components. You pass in a string of text, and it returns a linguistic analysis: sentence boundaries, individual tokens, their base forms (lemmas), their part-of-speech tags, and the dependency relations that link words within each sentence. It suits anyone who needs deep textual analysis across many languages.
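To make the shape of that output concrete, here is a minimal sketch of one analyzed sentence. It does not call NLP-Cube: the `Token` class, the helper functions, and the hand-annotated sentence are all hypothetical illustrations, and the six-column layout is only loosely modeled on the CoNLL-U convention that dependency parsers commonly emit.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Token:
    """Hypothetical container for one analyzed token (not NLP-Cube's own class)."""
    index: int  # 1-based position within the sentence
    word: str   # surface form produced by tokenization
    lemma: str  # base form produced by lemmatization
    upos: str   # universal part-of-speech tag
    head: int   # index of the syntactic head; 0 marks the root
    label: str  # dependency relation to the head


# Hand-annotated example, standing in for what a pipeline might return.
sentence: List[Token] = [
    Token(1, "Cats", "cat", "NOUN", 2, "nsubj"),
    Token(2, "chase", "chase", "VERB", 0, "root"),
    Token(3, "mice", "mouse", "NOUN", 2, "obj"),
    Token(4, ".", ".", "PUNCT", 2, "punct"),
]


def to_rows(tokens: List[Token]) -> str:
    """Render tokens as tab-separated rows, one token per line."""
    return "\n".join(
        "\t".join([str(t.index), t.word, t.lemma, t.upos, str(t.head), t.label])
        for t in tokens
    )


def root(tokens: List[Token]) -> Token:
    """Return the token whose head is 0, i.e. the root of the dependency tree."""
    return next(t for t in tokens if t.head == 0)


def children(tokens: List[Token], head_index: int) -> List[str]:
    """Return the words that attach directly to the token at head_index."""
    return [t.word for t in tokens if t.head == head_index]


if __name__ == "__main__":
    print(to_rows(sentence))
    print("root:", root(sentence).word)
    print("dependents of root:", children(sentence, root(sentence).index))
```

Each row carries the lemma, part-of-speech tag, and head pointer for one token, so downstream code can walk the dependency tree without re-parsing the text.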

563 stars. No commits in the last 6 months.

Use this if you need to process raw text by segmenting sentences, tokenizing words, identifying their parts-of-speech, lemmatizing them, or understanding their grammatical dependencies across a wide range of languages.

Not ideal if you only need simple keyword extraction or sentiment analysis and have no use for the underlying linguistic structure.

natural-language-processing computational-linguistics text-analysis data-preparation multilingual-data

Stale 6m · No Package · No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 22 / 25

Stars

563

Forks

Language

HTML

License

Apache-2.0

Related tools

cspnms/MSchunker

Smart text chunker for LLM preprocessing (sections → paragraphs → sentences → hard splits).
