DEK11/MoreNLP

Capabilities of StanfordNLP and OpenNLP on Spark

29 / 100 (Experimental)

This project helps data scientists and NLP researchers who need to process large volumes of text efficiently. It takes raw text data and provides processed linguistic features like tokenized words, parts of speech, and recognized entities. You would use this to prepare text for further analysis or machine learning tasks.

No commits in the last 6 months.

Use this if you need a flexible way to apply standard Natural Language Processing (NLP) techniques to large text datasets, leveraging either Stanford NLP or OpenNLP within a Spark environment.

Not ideal if you require custom model training for advanced NLP tasks, or if your primary need is for stop-word removal without building a custom list.

text-analysis natural-language-processing big-data-text-processing linguistic-feature-extraction
Stale 6m · No Package · No Dependents
Maintenance 0 / 25
Adoption 4 / 25
Maturity 16 / 25
Community 9 / 25


Stars: 7
Forks: 1
Language: Scala
License: Apache-2.0
Last pushed: Sep 23, 2018
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/DEK11/MoreNLP"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
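
For scripted use, the same request can be made from Python. This is a minimal sketch using only the standard library. The helper names `quality_url` and `fetch_quality` are hypothetical, and the assumption that the endpoint returns a JSON body is not stated on this page. How an API key would be passed is not documented here, so the sketch makes unauthenticated requests only.

```python
import json
import urllib.request

# Base path taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL in the same shape as the curl command above."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record for one repository.

    Assumption: the response body is JSON. The page does not document
    the response format, so inspect the result before relying on any field.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to make the request.
    print(quality_url("nlp", "DEK11", "MoreNLP"))
```

With the unauthenticated limit of 100 requests/day, it is worth caching results locally when checking many repositories.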