JohnSnowLabs/spark-nlp
State of the Art Natural Language Processing
Spark NLP helps data scientists and AI/ML engineers process and analyze large volumes of text, speech, and image data. It takes raw inputs such as documents, audio, or images and applies natural language processing (NLP) and machine learning models to produce outputs such as sentiment labels, named entities, or translations. It suits teams building scalable applications that need to understand or generate natural language.
4,116 stars. Actively maintained with 3 commits in the last 30 days.
Use this if you need text analysis, machine translation, sentiment analysis, or image and speech processing on very large datasets, and the solution has to scale in a distributed computing environment.
Not ideal if you are working with small, non-textual datasets and do not need the distributed processing capabilities of Apache Spark.
Stars: 4,116
Forks: 739
Language: Scala
License: Apache-2.0
Category:
Last pushed: Mar 12, 2026
Commits (30d): 3
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/JohnSnowLabs/spark-nlp"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
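The same request can be made from a script. The following is a minimal Python sketch, not an official client. The base URL is copied from the curl command above, while the `category/owner/repo` path layout, the helper names `quality_url` and `fetch_quality`, and the JSON response body are assumptions that the page does not confirm.

```python
import json
import urllib.request
from urllib.parse import quote

# Base URL copied from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL.

    Assumption: the path is <category>/<owner>/<repo>, inferred from the
    single example URL; the page does not document the scheme.
    """
    parts = [quote(p, safe="") for p in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository.

    Assumption: the response body is JSON. No field names are assumed,
    so the decoded object is returned as-is.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to hit the network.
    print(quality_url("nlp", "JohnSnowLabs", "spark-nlp"))
```

Calling `fetch_quality("nlp", "JohnSnowLabs", "spark-nlp")` performs the actual request. Since the keyless tier allows 100 requests per day, it is worth caching results rather than polling.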
Related tools
dipanjanS/nlp_workshop_odsc_europe20
Extensive tutorials for the Advanced NLP Workshop in Open Data Science Conference Europe 2020....
JohnSnowLabs/nlu
1 line for thousands of State of The Art NLP models in hundreds of languages. The fastest and...
aaBadri/nlp-papers
Must-read papers on Natural Language Processing (NLP)
jairNeto/warren_buffet_letters
Repository using NLP techniques such as Transformers, Frequency analysis, document similarity at...
DmitryRyumin/EMNLP-2023-Papers
EMNLP 2023 Papers: Explore cutting-edge research from EMNLP 2023, the premier conference for...