Rust NLP Bindings Tools

Rust implementations of NLP libraries with language bindings (Python, Node.js, etc.), and Rust-based NLP tools designed for interoperability. Does NOT include language-specific NLP tools, application-focused projects, or pure Python/JavaScript libraries.

127 Rust NLP bindings tools are tracked. 9 score 50 or above (Established tier). The highest-rated is PyThaiNLP/nlpo3 at 69/100, with 42 stars and 1,220 monthly downloads.

Get all 127 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=nlp&subcategory=rust-nlp-bindings&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
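
For scripted access, the same request can be assembled in Python. This is a minimal sketch using only the standard library. The endpoint and query parameters are copied from the curl command above; the helper names `build_url` and `fetch_projects` are hypothetical, and because the shape of the JSON response is not documented here, the decoded body is returned as-is without assuming any fields.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="nlp", subcategory="rust-nlp-bindings", limit=20):
    """Assemble the query URL (hypothetical helper).

    Parameter names and default values mirror the curl example.
    """
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """Fetch the URL and decode the JSON body (hypothetical helper).

    No response schema is assumed; the caller inspects the result.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_projects(build_url())` performs the request without a key, so it counts against the 100 requests/day allowance. The `limit=20` default is kept from the original command; whether a larger value returns all 127 projects in one response is not stated here.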

| # | Tool | Description | Score | Tier |
|---|---|---|---|---|
| 1 | PyThaiNLP/nlpo3 | Thai natural language processing library in Rust, with Python and Node bindings. | 69 | Established |
| 2 | forzagreen/n2words | Convert numerical numbers to written numbers, in 52+ languages. | 67 | Established |
| 3 | greyblake/whatlang-rs | Natural language detection library for Rust. Try demo online: https://whatlang.org/ | 62 | Established |
| 4 | wikimedia/sentencex | A sentence segmentation library with wide language support optimized for... | 61 | Established |
| 5 | pemistahl/lingua-rs | The most accurate natural language detection library for Rust, suitable for... | 60 | Established |
| 6 | quickwit-oss/whichlang | A blazingly fast and lightweight language detection library for Rust | 55 | Established |
| 7 | fbilhaut/gline-rs | Inference engine for GLiNER models, in Rust | 55 | Established |
| 8 | openvenues/pypostal | Python bindings to libpostal for fast international address parsing/normalization | 51 | Established |
| 9 | jaidevd/numerizer | A Python module to convert natural language numerics into ints and floats. | 50 | Established |
| 10 | messense/fasttext-rs | fastText Rust binding | 49 | Emerging |
| 11 | akshaynagpal/w2n | Convert number words (eg. twenty one) to numeric digits (21) | 49 | Emerging |
| 12 | cmccomb/rust-stop-words | Common stop words in a variety of languages | 48 | Emerging |
| 13 | joshrotenberg/lingua_ex | An Elixir wrapper around the Rust Lingua language detection library. | 48 | Emerging |
| 14 | yohasebe/engtagger | English Part-of-Speech Tagger Library; a Ruby port of Lingua::EN::Tagger | 47 | Emerging |
| 15 | proycon/analiticcl | an approximate string matching or fuzzy-matching system for spelling... | 47 | Emerging |
| 16 | djc/instant-segment | Fast English word segmentation in Rust | 46 | Emerging |
| 17 | allo-media/text2num-rs | Parse and convert numbers written in English, Dutch, Spanish, German,... | 46 | Emerging |
| 18 | openvenues/node-postal | NodeJS bindings to libpostal for fast international address parsing/normalization | 46 | Emerging |
| 19 | minibikini/paasaa | 🔤 Natural language detection for Elixir without AI | 45 | Emerging |
| 20 | abitdodgy/words_counted | A Ruby natural language processor. | 45 | Emerging |
| 21 | shnewto/ttaw | a piecemeal natural language processing library | 45 | Emerging |
| 22 | cadmiumcr/cadmium | Natural Language Processing (NLP) library for Crystal | 43 | Emerging |
| 23 | ticki/eudex | A blazingly fast phonetic reduction/hashing algorithm. | 43 | Emerging |
| 24 | snipsco/snips-nlu-parsers | Rust crate for entity parsing | 42 | Emerging |
| 25 | eklem/words-n-numbers | Tokenizing strings of text. Regex extracting arrays of words and optionally... | 41 | Emerging |
| 26 | rth/vtext | Simple NLP in Rust with Python bindings | 41 | Emerging |
| 27 | doppio/word2num | A Python package for converting numbers expressed in natural language to... | 41 | Emerging |
| 28 | kensho-technologies/sequence_align | Efficient implementations of Needleman-Wunsch and other sequence alignment... | 40 | Emerging |
| 29 | stellanomia/uroman-rs | A self-contained Rust reimplementation of uroman, a universal romanizer. | 40 | Emerging |
| 30 | kampersanda/tongrams-rs | Rust library providing fast language model queries in compressed space | 40 | Emerging |
| 31 | Lips7/Matcher | A high-performance matcher designed to solve LOGICAL and TEXT VARIATIONS... | 40 | Emerging |
| 32 | cpcdoy/rust-sbert | Rust port of sentence-transformers (https://github.com/UKPLab/sentence-transformers) | 39 | Emerging |
| 33 | zxc7563598/php-address-parser | Smart shipping-address parser that extracts name, mobile number, ID card number, province/city/district, detailed address, and other fields from unstructured text; suited to e-commerce, logistics, CRM, and similar systems / An... | 39 | Emerging |
| 34 | rwalk/gsdmm-rust | GSDMM: Short text clustering (Rust implementation) | 39 | Emerging |
| 35 | unfoldingWord/string-punctuation-tokenizer | Small library that provides functions to tokenize a string into an array of... | 38 | Emerging |
| 36 | proycon/lingua-cli | Very small simple command-line interface for language detection using lingua-rs | 38 | Emerging |
| 37 | amokan/human_name | Elixir bindings for the human-name crate implemented as a safe Rust NIF | 38 | Emerging |
| 38 | arbox/wlapi | Ruby based API for the project Wortschatz Leipzig. | 38 | Emerging |
| 39 | dominictarro/semchunk-rs | A fast and lightweight Rust library for splitting text into semantically... | 38 | Emerging |
| 40 | sebpuetz/lumberjack | Read and modify constituency trees in Rust. | 37 | Emerging |
| 41 | gjtorikian/what_you_say | Natural language detection library. Written in Rust, wrapped in Ruby. | 37 | Emerging |
| 42 | victoryosiobe/kingchop | Kingchop ⚔️ is a JavaScript English based library for tokenizing text... | 36 | Emerging |
| 43 | victor-iyi/sage | Software stack powering Project: "Enhancing Human Intelligence" | 36 | Emerging |
| 44 | annotation/stam-rust | Programming library for the Standoff Text Annotation Model (STAM), written... | 35 | Emerging |
| 45 | dluman/rusTy | Rust bindings for the spaCy library. | 35 | Emerging |
| 46 | proycon/folia-rust | FoLiA library for rust (alpha) | 35 | Emerging |
| 47 | kimryan/StreetAddressParser | extract components of a street address from free format text | 34 | Emerging |
| 48 | rodaine/numwords | Go package to convert natural language strings to numbers | 34 | Emerging |
| 49 | famished-tiger/Rley | An Earley parser written in Ruby | 33 | Emerging |
| 50 | antononcube/Raku-Lingua-NumericWordForms | Raku functions that generate, parse, and interpret numeric word forms in... | 33 | Emerging |
| 51 | sjmielke/ptb-reader-rust | Simple parsing of the merged Penn Treebank format. | 33 | Emerging |
| 52 | fbilhaut/gliclass-rs | GLiClass inferences in Rust | 33 | Emerging |
| 53 | holbewoner/yn.rs | Natural language processing library for yes or no values written in Rust | 32 | Emerging |
| 54 | mladvladimir/rust-sentence-transformers | Rust port of https://github.com/UKPLab/sentence-transformers | 32 | Emerging |
| 55 | vgel/treebender | A HDPSG-inspired symbolic natural language parser written in Rust | 32 | Emerging |
| 56 | stdlib-js/nlp-tokenize | Tokenize a string. | 32 | Emerging |
| 57 | amake/srx-ruby | An SRX segmenting engine for Ruby | 32 | Emerging |
| 58 | jacksonllee/rustling | A blazingly fast library for computational linguistics | 31 | Emerging |
| 59 | monorkin/witty | wit.ai client library for Rust | 31 | Emerging |
| 60 | o24s/haqumei | A Japanese Grapheme-to-Phoneme (G2P) library. | 31 | Emerging |
| 61 | despawnerer/truecase | Restore correct letter casings in arbitrary text using a statistical model | 31 | Emerging |
| 62 | uetchy/homebrew-nlp | 🍺 a Homebrew keg that specialized in Natural Language Processing. | 30 | Emerging |
| 63 | chrovis/parattice | Recursive paraphrase lattice generator | 30 | Emerging |
| 64 | RapidappsIT/uaddress | 🇺🇦 UAddress: NLP parser for Ukrainian addresses | 30 | Emerging |
| 65 | omarmhaimdat/whatlang-pyo3 | Python Binding for Rust WhatLang, a language detection library | 30 | Emerging |
| 66 | mrseanryan/in_definite | 🅰️ Rust port of 'npm indefinite' for deciding which indefinite article to... | 30 | Emerging |
| 67 | ashvardanian/HashEvals | Minimalistic Rust toolkit for hash function quality analysis. Tests... | 29 | Experimental |
| 68 | antononcube/Raku-Lingua-StopwordsISO | Raku package for stop words of different languages and stop words deletion.... | 29 | Experimental |
| 69 | russianwordnet/yarn | Yet Another RussNet | 29 | Experimental |
| 70 | niclaslind/shorelark | Rust + NLP + WASM | 29 | Experimental |
| 71 | Yuzufi/word-freq-statistic | High-performance word-frequency counter for Chinese corpora using blind word segmentation: counts the 2-character words in a 1-billion-character corpus within 1 minute! | 29 | Experimental |
| 72 | Tiphereth-A/tdector | A linguistics tool: mark words & find similar sentences | 28 | Experimental |
| 73 | joh-ga/RubyCrumbler | A simple Ruby script that contains a GUI desktop application providing... | 28 | Experimental |
| 74 | alordash/parse-word-to-number | Extracts numbers written as words from string. | 27 | Experimental |
| 75 | MeltwaterArchive/ex_lsh | A configurable implementation of locality-sensitive hashing in Elixir | 27 | Experimental |
| 76 | dhchenx/rsnltk | Rust-based Natural Language Toolkit using Python Bindings | 26 | Experimental |
| 77 | ZJaume/heliport | Fast and accurate language identifier | 26 | Experimental |
| 78 | 7086cmd/iris | A universal code translator based on intermediate representations. | 25 | Experimental |
| 79 | talmago/fast_gliner | Python bindings to Inference engine for GLiNER models written in Rust | 25 | Experimental |
| 80 | messense/bosonnlp-rs | BosonNLP SDK for Rust | 25 | Experimental |
| 81 | annotation/stam-python | Python binding to work with STAM, the Standoff Text Annotation Model, from... | 25 | Experimental |
| 82 | LdDl/langdetect-rs | Language detection in Rust. Port of Mimino666's langdetect. | 24 | Experimental |
| 83 | VitalinaZlo/VolgaIT-2024_AI | The project aims to develop an algorithm for automatic recognition... | 23 | Experimental |
| 84 | Miezhiko/Kathoey | Rust library for text feminization using open corpus linguistics data | 23 | Experimental |
| 85 | Flight-School/sentences | A command-line utility that splits natural language text into sentences. | 23 | Experimental |
| 86 | cgbur/croppy | Batch auto-crop tool for scanned film negatives in Lightroom | 23 | Experimental |
| 87 | patrols/ruby_llm-text | ActiveSupport-style Ruby gem for LLM text operations: summarize, translate,... | 23 | Experimental |
| 88 | Lambda-Logan/creature_feature | Composable n-gram combinators that are ergonomic and bare-metal fast | 23 | Experimental |
| 89 | Aljutor/yurki | Fast NLP tools for Python | 23 | Experimental |
| 90 | frankier/opus-parse | This Rust library can parse OPUS's monolingual XML files. | 23 | Experimental |
| 91 | justi/price_scanner | Battle-tested multi-currency price extraction from text. Supports PLN, EUR,... | 23 | Experimental |
| 92 | Mango-Cats/tagabaybay | Orthographic nativization for Filipino loanwords. | 22 | Experimental |
| 93 | parhamr/nlp-pure | Natural language processing algorithms implemented in pure Ruby with minimal... | 22 | Experimental |
| 94 | TangoJP/rust_vectorizer | Practice Rust by making Vectorizer | 22 | Experimental |
| 95 | ademakdogan/hyperfuzz | Blazing-fast string similarity library written in Rust with Python bindings.... | 22 | Experimental |
| 96 | abitdodgy/gibran | Gibran is an Elixir natural language processor, and a port of WordsCounted. | 22 | Experimental |
| 97 | roloza7/sstn | Super Simple Text Normalizer in Rust with SIMD for x86 | 21 | Experimental |
| 98 | nathankleyn/ruby-nlp | Various NLP tools for Ruby | 21 | Experimental |
| 99 | scurkovic/cutters | A rule based sentence segmentation library. | 21 | Experimental |
| 100 | IsiXhosa-click/isixhosa | A library to help process text in isiXhosa for Rust | 20 | Experimental |
| 101 | GarthTB/WordFreqCounter | Word-frequency counter for Chinese corpora using blind word segmentation | 20 | Experimental |
| 102 | GarthTB/word-freq-statistic | High-performance word-frequency counter for Chinese corpora using blind word segmentation: counts the 2-character words in a 1-billion-character corpus within 1 minute! | 20 | Experimental |
| 103 | shubham0204/postagger.rs | NLTK inspired Parts-of-Speech Tagger (Perceptron Tagger) in Rust | 20 | Experimental |
| 104 | pymorphy2-fork/morphrs-py | Experimental morph-rs bindings for Python. | 20 | Experimental |
| 105 | HectorPulido/human-language-toolkit-chatbot | nltk like chatbot in rust | 20 | Experimental |
| 106 | xamgore/segtok | A rule-based sentence segmenter (splitter) and a word tokenizer using... | 20 | Experimental |
| 107 | yuanzhoulvpi2017/Rust4SenVec | convert sentence to vector by nlp transformers model in Rust | 20 | Experimental |
| 108 | sigrlami/lanhunch | Language Detection Library | 20 | Experimental |
| 109 | proycon/lexmatch | Simple lexicon matcher against a text | 20 | Experimental |
| 110 | loony-bean/stopwords-rs | Stopwords from popular text processing frameworks | 19 | Experimental |
| 111 | arclabs561/phrasegen | Phrase generation and extraction | 19 | Experimental |
| 112 | joshrotenberg/unimorph-rs | A Rust toolkit for working with UniMorph morphological data | 19 | Experimental |
| 113 | cliftontoaster-reid/wit_owo | A Rust library for the Wit.ai API | 19 | Experimental |
| 114 | arclabs561/textprep | Text preprocessing primitives: normalization, tokenization, and fast keyword... | 19 | Experimental |
| 115 | allenai/rustberta-snli | 🦀 A Rust implementation of a RoBERTa classification model for the SNLI dataset | 19 | Experimental |
| 116 | Adityagupta-dev/Indian-Address-Parser | The Indian Address Parser is an advanced Natural Language Processing (NLP)... | 18 | Experimental |
| 117 | Chubek/upsc3ne | An obscenity detection API in Rust using Custom Implementations | 18 | Experimental |
| 118 | chriamue/bert-cli | CLI for rust bert | 17 | Experimental |
| 119 | dcavar/rust-tutorial-notebooks | Rust tutorials for NLP and AI | 17 | Experimental |
| 120 | gembleman/bareun_rs | bareun-rs is an unofficial Rust library for Bareun, a Korean morphological analyzer. | 15 | Experimental |
| 121 | eliangonde/langid | Rust implementation of the langid library for language identification.... | 14 | Experimental |
| 122 | martinjack/uaddresspacy | 🇺🇦 UAddresspacy: spaCy-based parsing of a Ukrainian address into types | 12 | Experimental |
| 123 | wrnrlr/nlpg | NLP for Postgres | 12 | Experimental |
| 124 | Bbeierle12/Word-Slush | Word frequency analyzer for Claude conversation exports | 11 | Experimental |
| 125 | jesper-olsen/glove-rs | Rusty GloVe - compute GloVe word embeddings for a text corpus. | 11 | Experimental |
| 126 | loony-bean/lda-rs | Experimenting with LDA in Rust | 11 | Experimental |
| 127 | TheOpenDictionary/ttb | A lightning-fast tool for querying Tatoeba from the command-line ⚡ | 10 | Experimental |
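
The scores and tiers in the listing above are consistent with fixed cut-offs: every project scoring 50 or more is Established, 30 to 49 is Emerging, and 29 or below is Experimental. The sketch below encodes those boundaries as inferred from this listing only. The function name `tier_for` is hypothetical, and the actual scoring rules may include other tiers or conditions that are not visible here.

```python
def tier_for(score: int) -> str:
    """Map a 0-100 quality score to a tier name.

    Cut-offs are inferred from the listing (50 -> Established,
    49 and 30 -> Emerging, 29 -> Experimental); they are an assumption,
    not a documented rule.
    """
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score >= 50:
        return "Established"
    if score >= 30:
        return "Emerging"
    return "Experimental"
```

For example, `tier_for(69)` gives the tier shown for PyThaiNLP/nlpo3, and `tier_for(29)` gives the tier shown for the first Experimental entry.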