All NLP Tools

13,598 tools ranked by quality score

Showing 1–100 of 13,598
# Tool Score Tier
1 PyThaiNLP/pythainlp

Thai natural language processing in Python

90
Verified
2 nltk/nltk

NLTK Source

87
Verified
3 explosion/spaCy

💫 Industrial-strength Natural Language Processing (NLP) in Python

82
Verified
4 chrismattmann/tika-python

Tika-Python is a Python binding to the Apache Tika™ REST services allowing...

81
Verified
5 sloria/TextBlob

Simple, Pythonic, text processing--Sentiment analysis, part-of-speech...

81
Verified
6 undertheseanlp/underthesea

Underthesea - Vietnamese NLP Toolkit

80
Verified
7 urchade/GLiNER

Generalist and Lightweight Model for Named Entity Recognition (Extract any...

79
Verified
8 google/sentencepiece

Unsupervised text tokenizer for Neural Network-based text generation.

78
Verified
9 spencermountain/compromise

modest natural-language processing

78
Verified
10 textlint/textlint

textlint is the pluggable linter for natural language text.

77
Verified
11 deepdoctection/deepdoctection

A Repo For Document AI

76
Verified
12 acl-org/acl-anthology

Data and software for building the ACL Anthology.

76
Verified
13 quanteda/quanteda

An R package for the Quantitative Analysis of Textual Data

74
Verified
14 gunthercox/chatterbot-corpus

A multilingual dialog corpus

74
Verified
15 apache/opennlp

Apache OpenNLP

74
Verified
16 miso-belica/sumy

Module for automatic summarization of text documents and HTML pages.

73
Verified
17 NPC-Worldwide/npcpy

The python library for research and development in NLP, multimodal LLMs,...

73
Verified
18 EmilStenstrom/conllu

A CoNLL-U parser that takes a CoNLL-U formatted string and turns it into a...

73
Verified
19 flairNLP/fundus

A very simple news crawler with a funny name

72
Verified
20 google/langfun

OO for LLMs

72
Verified
21 jxmorris12/language_tool_python

a free python grammar checker 📝✅

71
Verified
22 isaacus-dev/semchunk

A fast, lightweight and easy-to-use Python library for splitting text into...

71
Verified
23 stanfordnlp/stanza

Stanford NLP Python library for tokenization, sentence segmentation, NER,...

71
Verified
24 languagetool-org/languagetool

Style and Grammar Checker for 25+ Languages

71
Verified
25 malaysia-ai/malaya

Natural Language Toolkit for Malaysian language, https://malaya.readthedocs.io/

71
Verified
26 hyperquest-hq/hyperbase

A foundational library for Semantic Hypergraphs

70
Verified
27 cltk/cltk

The Classical Language Toolkit

70
Verified
28 hellohaptik/chatbot_ner

chatbot_ner: Named Entity Recognition for chatbots.

70
Verified
29 JoeanAmier/XHS-Downloader

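XiaoHongShu (RedNote) link extraction and post collection tool: extracts links to an account's published, favorited, liked, and album posts; extracts post and user links from search results; collects XiaoHongShu posts...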

69
Established
30 unitaryai/detoxify

Trained models & code to predict toxic comments on all 3 Jigsaw Toxic...

69
Established
31 lovit/soynlp

A Python library for Korean natural language processing. Provides word extraction, tokenization, part-of-speech tagging, and preprocessing.

69
Established
32 davidsbatista/BREDS

"Bootstrapping Relationship Extractors with Distributional Semantics"...

69
Established
33 DerwenAI/pytextrank

Python implementation of TextRank algorithms ("textgraphs") for phrase extraction

69
Established
34 google/langextract

A Python library for extracting structured information from unstructured...

69
Established
35 PyThaiNLP/nlpo3

Thai natural language processing library in Rust, with Python and Node bindings.

69
Established
36 hunspell/hunspell

The most popular spellchecking library.

69
Established
37 robocorp/rpaframework

Collection of open-source libraries and tools for Robotic Process Automation...

69
Established
38 chatopera/Synonyms

🌿 Chinese synonyms: a toolkit for chatbots and intelligent question answering

69
Established
39 deanmalmgren/textract

extract text from any document. no muss. no fuss.

68
Established
40 hplt-project/sacremoses

Python port of Moses tokenizer, truecaser and normalizer

68
Established
41 dkpro/dkpro-cassis

UIMA CAS processing library written in Python

68
Established
42 flairNLP/flair

A very simple framework for state-of-the-art Natural Language Processing (NLP)

68
Established
43 CUNY-CL/wikipron

Massively multilingual pronunciation mining

68
Established
44 estnltk/estnltk

Open source tools for Estonian natural language processing

67
Established
45 ziqizhang/jate

JATE - Just Automatic Term Extraction (in Python)

67
Established
46 bab2min/kiwipiepy

Python API for Kiwi

67
Established
47 discopy/discopy

The Python toolkit for computing with string diagrams.

67
Established
48 grobidOrg/grobid

A machine learning software for extracting information from scholarly documents

67
Established
49 CAMeL-Lab/camel_tools

A suite of Arabic natural language processing tools developed by the CAMeL...

67
Established
50 MIND-Lab/OCTIS

OCTIS: Comparing Topic Models is Simple! A python package to optimize and...

67
Established
51 hankcs/HanLP

Natural Language Processing for the next decade. Tokenization,...

67
Established
52 forzagreen/n2words

Convert numerical numbers to written numbers, in 52+ languages.

67
Established
53 aphp/edsnlp

Modular, fast NLP framework, compatible with Pytorch and spaCy, offering...

66
Established
54 kenlimmj/rouge

A Javascript implementation of the Recall-Oriented Understudy for Gisting...

66
Established
55 OpenPecha/Botok

🏷 བོད་ཏོག [pʰøtɔk̚] Tibetan word tokenizer in Python

66
Established
56 Helsinki-NLP/OpusFilter

OpusFilter - Parallel corpus processing toolkit

65
Established
57 jacksonllee/pylangacq

Language Acquisition Research Tools

65
Established
58 NatLibFi/Annif

Annif is a multi-algorithm automated subject indexing tool for libraries,...

65
Established
59 MAIF/melusine

📧 Melusine: Use python to automatize your email processing workflow

65
Established
60 polyrabbit/WeCron

✔️ Scheduled reminders on WeChat - Cron on WeChat

65
Established
61 goodmami/wn

A modern, interlingual wordnet interface for Python

65
Established
62 winkjs/wink-nlp

Developer friendly Natural Language Processing ✨

65
Established
63 bab2min/Kiwi

Kiwi (an intelligent Korean morphological analyzer)

65
Established
64 Alir3z4/python-stop-words

Get list of common stop words in various languages in Python

65
Established
65 allenai/scispacy

A full spaCy pipeline and models for scientific/biomedical documents.

64
Established
66 ryanjgallagher/shifterator

Interpretable data visualizations for understanding how texts differ at the...

64
Established
67 rmovva/HypotheSAEs

HypotheSAEs: hypothesizing interpretable relationships in text datasets...

64
Established
68 dsfsi/textaugment

TextAugment: Text Augmentation Library

64
Established
69 facebookresearch/stopes

A library for preparing data for machine translation research (monolingual...

64
Established
70 anoopkunchukuttan/indic_nlp_library

Resources and tools for Indian language Natural Language Processing

64
Established
71 adbar/htmldate

Fast and robust date extraction from web pages, with Python or on the command-line

64
Established
72 jacksonllee/pycantonese

Cantonese Linguistics and NLP

64
Established
73 huggingface/setfit

Efficient few-shot learning with Sentence Transformers

64
Established
74 chatopera/efaqa-corpus-zh

❤️ Emotional First Aid Dataset: a corpus for psychological counseling Q&A and chatbots

64
Established
75 wooorm/franc

Natural language detection

63
Established
76 GaoQ1/rasa_nlu_gq

Turn natural language into structured data (supports Chinese; includes N custom models for different scenarios and tasks)

63
Established
77 gunthercox/mathparse

A Python library for evaluating natural language mathematical equations

63
Established
78 DataFog/datafog-python

Python SDK for PII detection and redaction in text and images, combining...

63
Established
79 angelosalatino/cso-classifier

Python library that classifies content from scientific papers with the...

63
Established
80 ChenghaoMou/text-dedup

All-in-one text de-duplication

63
Established
81 Ayanami0730/deep_research_bench

DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents

63
Established
82 JohnSnowLabs/spark-nlp

State of the Art Natural Language Processing

63
Established
83 HIT-SCIR/ltp

Language Technology Platform

63
Established
84 vmenger/deduce

Deduce: de-identification method for Dutch medical text

63
Established
85 i-dot-ai/themefinder

A topic modelling Python package for analysing one-to-many question-answer data.

63
Established
86 laugustyniak/awesome-sentiment-analysis

Repository with everything necessary for sentiment analysis and related areas

62
Established
87 NateScarlet/holiday-cn

📅🇨🇳 Chinese statutory holiday data, scraped automatically every day from State Council announcements

62
Established
88 maximtrp/bitermplus

Biterm Topic Model (BTM): modeling topics in short texts

62
Established
89 MantisAI/nervaluate

Full named-entity (i.e., not tag/token) evaluation metrics based on SemEval’13

62
Established
90 segment-any-text/wtpsplit

Toolkit to segment text into sentences or other semantic units in a robust,...

62
Established
91 vi3k6i5/flashtext

Extract Keywords from sentence or Replace keywords in sentences.

62
Established
92 dongrixinyu/JioNLP

An accurate, efficient, easy-to-use Chinese NLP preprocessing and parsing toolkit. A Chinese NLP Preprocessing & Parsing Package...

62
Established
93 greyblake/whatlang-rs

Natural language detection library for Rust. Try demo online: https://whatlang.org/

62
Established
94 titipata/pubmed_parser

📋 A Python Parser for PubMed Open-Access XML Subset and MEDLINE XML Dataset

62
Established
95 loretoparisi/fasttext.js

FastText for Node.js

62
Established
96 guillaume-be/rust-tokenizers

Rust-tokenizer offers high-performance tokenizers for modern language...

61
Established
97 chakki-works/seqeval

A Python framework for sequence labeling evaluation(named-entity...

61
Established
98 fhamborg/news-please

news-please - an integrated web crawler and information extractor for news...

61
Established
99 wikimedia/sentencex

A sentence segmentation library with wide language support optimized for...

61
Established
100 Tiiiger/bert_score

BERT score for text generation

61
Established