Rust NLP Bindings NLP Tools
Rust implementations of NLP libraries with language bindings (Python, Node.js, etc.), and Rust-based NLP tools designed for interoperability. Does NOT include language-specific NLP tools, application-focused projects, or pure Python/JavaScript libraries.
127 Rust NLP bindings tools are tracked. 9 score above 50 (Established tier). The highest-rated is PyThaiNLP/nlpo3 at 69/100, with 42 stars and 1,220 monthly downloads.
Get all 127 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=nlp&subcategory=rust-nlp-bindings&limit=20"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
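The curl command above passes `limit=20`, while the listing covers 127 projects. As a minimal sketch, the same request URL can be built in Python with the limit as a parameter. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and whether the API accepts a limit above 20, or what shape its JSON takes, is an assumption that is not confirmed here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint copied from the curl command in this listing.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_quality_url(domain="nlp", subcategory="rust-nlp-bindings", limit=20):
    """Build the dataset URL; the defaults reproduce the curl command."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_quality(limit=20, timeout=30):
    """Fetch and decode the JSON response (hypothetical helper).

    The response schema is not documented in this listing, so the
    decoded object is returned unmodified.
    """
    with urlopen(build_quality_url(limit=limit), timeout=timeout) as response:
        return json.load(response)


# Assumption: the API may cap or reject limits above 20.
print(build_quality_url(limit=127))
```

Calling `fetch_quality()` counts against the 100 requests/day keyless quota, so it is worth caching the result locally rather than re-fetching.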
| # | Tool | Score | Tier |
|---|---|---|---|
| 1 | PyThaiNLP/nlpo3: Thai natural language processing library in Rust, with Python and Node bindings. | | Established |
| 2 | forzagreen/n2words: Convert numerical numbers to written numbers, in 52+ languages. | | Established |
| 3 | greyblake/whatlang-rs: Natural language detection library for Rust. Try demo online: https://whatlang.org/ | | Established |
| 4 | wikimedia/sentencex: A sentence segmentation library with wide language support optimized for... | | Established |
| 5 | pemistahl/lingua-rs: The most accurate natural language detection library for Rust, suitable for... | | Established |
| 6 | quickwit-oss/whichlang: A blazingly fast and lightweight language detection library for Rust | | Established |
| 7 | fbilhaut/gline-rs: Inference engine for GLiNER models, in Rust | | Established |
| 8 | openvenues/pypostal: Python bindings to libpostal for fast international address parsing/normalization | | Established |
| 9 | jaidevd/numerizer: A Python module to convert natural language numerics into ints and floats. | | Established |
| 10 | messense/fasttext-rs: fastText Rust binding | | Emerging |
| 11 | akshaynagpal/w2n: Convert number words (eg. twenty one) to numeric digits (21) | | Emerging |
| 12 | cmccomb/rust-stop-words: Common stop words in a variety of languages | | Emerging |
| 13 | joshrotenberg/lingua_ex: An Elixir wrapper around the Rust Lingua language detection library. | | Emerging |
| 14 | yohasebe/engtagger: English Part-of-Speech Tagger Library; a Ruby port of Lingua::EN::Tagger | | Emerging |
| 15 | proycon/analiticcl: an approximate string matching or fuzzy-matching system for spelling... | | Emerging |
| 16 | djc/instant-segment: Fast English word segmentation in Rust | | Emerging |
| 17 | allo-media/text2num-rs: Parse and convert numbers written in English, Dutch, Spanish, German,... | | Emerging |
| 18 | openvenues/node-postal: NodeJS bindings to libpostal for fast international address parsing/normalization | | Emerging |
| 19 | minibikini/paasaa: 🔤 Natural language detection for Elixir without AI | | Emerging |
| 20 | abitdodgy/words_counted: A Ruby natural language processor. | | Emerging |
| 21 | shnewto/ttaw: a piecemeal natural language processing library | | Emerging |
| 22 | cadmiumcr/cadmium: Natural Language Processing (NLP) library for Crystal | | Emerging |
| 23 | ticki/eudex: A blazingly fast phonetic reduction/hashing algorithm. | | Emerging |
| 24 | snipsco/snips-nlu-parsers: Rust crate for entity parsing | | Emerging |
| 25 | eklem/words-n-numbers: Tokenizing strings of text. Regex extracting arrays of words and optionally... | | Emerging |
| 26 | rth/vtext: Simple NLP in Rust with Python bindings | | Emerging |
| 27 | doppio/word2num: A Python package for converting numbers expressed in natural language to... | | Emerging |
| 28 | kensho-technologies/sequence_align: Efficient implementations of Needleman-Wunsch and other sequence alignment... | | Emerging |
| 29 | stellanomia/uroman-rs: A self-contained Rust reimplementation of uroman, a universal romanizer. | | Emerging |
| 30 | kampersanda/tongrams-rs: Rust library providing fast language model queries in compressed space | | Emerging |
| 31 | Lips7/Matcher: A high-performance matcher designed to solve LOGICAL and TEXT VARIATIONS... | | Emerging |
| 32 | cpcdoy/rust-sbert: Rust port of sentence-transformers (https://github.com/UKPLab/sentence-transformers) | | Emerging |
| 33 | zxc7563598/php-address-parser: Smart shipping-address parsing tool that extracts name, mobile number, ID number, province/city/district, detailed address, and other fields from unstructured text; suitable for e-commerce, logistics, CRM, and similar systems | | Emerging |
| 34 | rwalk/gsdmm-rust: GSDMM: Short text clustering (Rust implementation) | | Emerging |
| 35 | unfoldingWord/string-punctuation-tokenizer: Small library that provides functions to tokenize a string into an array of... | | Emerging |
| 36 | proycon/lingua-cli: Very small simple command-line interface for language detection using lingua-rs | | Emerging |
| 37 | amokan/human_name: Elixir bindings for the human-name crate implemented as a safe Rust NIF | | Emerging |
| 38 | arbox/wlapi: Ruby based API for the project Wortschatz Leipzig. | | Emerging |
| 39 | dominictarro/semchunk-rs: A fast and lightweight Rust library for splitting text into semantically... | | Emerging |
| 40 | sebpuetz/lumberjack: Read and modify constituency trees in Rust. | | Emerging |
| 41 | gjtorikian/what_you_say: Natural language detection library. Written in Rust, wrapped in Ruby. | | Emerging |
| 42 | victoryosiobe/kingchop: Kingchop ⚔️ is a JavaScript English based library for tokenizing text... | | Emerging |
| 43 | victor-iyi/sage: Software stack powering Project: "Enhancing Human Intelligence" | | Emerging |
| 44 | annotation/stam-rust: Programming library for the Standoff Text Annotation Model (STAM), written... | | Emerging |
| 45 | dluman/rusTy: Rust bindings for the spaCy library. | | Emerging |
| 46 | proycon/folia-rust: FoLiA library for rust (alpha) | | Emerging |
| 47 | kimryan/StreetAddressParser: extract components of a street address from free format text | | Emerging |
| 48 | rodaine/numwords: Go package to convert natural language strings to numbers | | Emerging |
| 49 | famished-tiger/Rley: An Earley parser written in Ruby | | Emerging |
| 50 | antononcube/Raku-Lingua-NumericWordForms: Raku functions that generate, parse, and interpret numeric word forms in... | | Emerging |
| 51 | sjmielke/ptb-reader-rust: Simple parsing of the merged Penn Treebank format. | | Emerging |
| 52 | fbilhaut/gliclass-rs: GLiClass inferences in Rust | | Emerging |
| 53 | holbewoner/yn.rs: Natural language processing library for yes or no values written in Rust | | Emerging |
| 54 | mladvladimir/rust-sentence-transformers: Rust port of https://github.com/UKPLab/sentence-transformers | | Emerging |
| 55 | vgel/treebender: A HDPSG-inspired symbolic natural language parser written in Rust | | Emerging |
| 56 | stdlib-js/nlp-tokenize: Tokenize a string. | | Emerging |
| 57 | amake/srx-ruby: An SRX segmenting engine for Ruby | | Emerging |
| 58 | jacksonllee/rustling: A blazingly fast library for computational linguistics | | Emerging |
| 59 | monorkin/witty: wit.ai client library for Rust | | Emerging |
| 60 | o24s/haqumei: A Japanese Grapheme-to-Phoneme (G2P) library. | | Emerging |
| 61 | despawnerer/truecase: Restore correct letter casings in arbitrary text using a statistical model | | Emerging |
| 62 | uetchy/homebrew-nlp: 🍺 a Homebrew keg that specialized in Natural Language Processing. | | Emerging |
| 63 | chrovis/parattice: Recursive paraphrase lattice generator | | Emerging |
| 64 | RapidappsIT/uaddress: 🇺🇦 UAddress: NLP parser for Ukrainian addresses | | Emerging |
| 65 | omarmhaimdat/whatlang-pyo3: Python Binding for Rust WhatLang, a language detection library | | Emerging |
| 66 | mrseanryan/in_definite: :a: Rust port of 'npm indefinite' for deciding which indefinite article to... | | Emerging |
| 67 | ashvardanian/HashEvals: Minimalistic Rust toolkit for hash function quality analysis. Tests... | | Experimental |
| 68 | antononcube/Raku-Lingua-StopwordsISO: Raku package for stop words of different languages and stop words deletion.... | | Experimental |
| 69 | russianwordnet/yarn: Yet Another RussNet | | Experimental |
| 70 | niclaslind/shorelark: Rust + NLP + WASM | | Experimental |
| 71 | Yuzufi/word-freq-statistic: High-performance word-frequency counter for Chinese corpora using blind segmentation: counts two-character words in a 1-billion-character corpus within 1 minute! | | Experimental |
| 72 | Tiphereth-A/tdector: A linguistics tool: mark words & find similar sentences | | Experimental |
| 73 | joh-ga/RubyCrumbler: A simple Ruby script that contains a GUI desktop application providing... | | Experimental |
| 74 | alordash/parse-word-to-number: Extracts numbers written as words from string. | | Experimental |
| 75 | MeltwaterArchive/ex_lsh: A configurable implementation of locality-sensitive hashing in Elixir | | Experimental |
| 76 | dhchenx/rsnltk: Rust-based Natural Language Toolkit using Python Bindings | | Experimental |
| 77 | ZJaume/heliport: Fast and accurate language identifier | | Experimental |
| 78 | 7086cmd/iris: A universal code translator based on intermediate representations. | | Experimental |
| 79 | talmago/fast_gliner: Python bindings to Inference engine for GLiNER models written in Rust | | Experimental |
| 80 | messense/bosonnlp-rs: BosonNLP SDK for Rust | | Experimental |
| 81 | annotation/stam-python: Python binding to work with STAM, the Standoff Text Annotation Model, from... | | Experimental |
| 82 | LdDl/langdetect-rs: Language detection in Rust. Port of Mimino666's langdetect. | | Experimental |
| 83 | VitalinaZlo/VolgaIT-2024_AI: The project aims to develop an algorithm for automatic recognition of... | | Experimental |
| 84 | Miezhiko/Kathoey: Rust library for text feminization using open corpus linguistics data | | Experimental |
| 85 | Flight-School/sentences: A command-line utility that splits natural language text into sentences. | | Experimental |
| 86 | cgbur/croppy: Batch auto-crop tool for scanned film negatives in Lightroom | | Experimental |
| 87 | patrols/ruby_llm-text: ActiveSupport-style Ruby gem for LLM text operations: summarize, translate,... | | Experimental |
| 88 | Lambda-Logan/creature_feature: Composable n-gram combinators that are ergonomic and bare-metal fast | | Experimental |
| 89 | Aljutor/yurki: Fast NLP tools for Python | | Experimental |
| 90 | frankier/opus-parse: This Rust library can parse OPUS's monolingual XML files. | | Experimental |
| 91 | justi/price_scanner: Battle-tested multi-currency price extraction from text. Supports PLN, EUR,... | | Experimental |
| 92 | Mango-Cats/tagabaybay: Orthographic nativization for Filipino loanwords. | | Experimental |
| 93 | parhamr/nlp-pure: Natural language processing algorithms implemented in pure Ruby with minimal... | | Experimental |
| 94 | TangoJP/rust_vectorizer: Practice Rust by making Vectorizer | | Experimental |
| 95 | ademakdogan/hyperfuzz: Blazing-fast string similarity library written in Rust with Python bindings.... | | Experimental |
| 96 | abitdodgy/gibran: Gibran is an Elixir natural language processor, and a port of WordsCounted. | | Experimental |
| 97 | roloza7/sstn: Super Simple Text Normalizer in Rust with SIMD for x86 | | Experimental |
| 98 | nathankleyn/ruby-nlp: Various NLP tools for Ruby | | Experimental |
| 99 | scurkovic/cutters: A rule based sentence segmentation library. | | Experimental |
| 100 | IsiXhosa-click/isixhosa: A library to help process text in isiXhosa for Rust | | Experimental |
| 101 | GarthTB/WordFreqCounter: Word-frequency counter for Chinese corpora using blind segmentation | | Experimental |
| 102 | GarthTB/word-freq-statistic: High-performance word-frequency counter for Chinese corpora using blind segmentation: counts two-character words in a 1-billion-character corpus within 1 minute! | | Experimental |
| 103 | shubham0204/postagger.rs: NLTK inspired Parts-of-Speech Tagger (Perceptron Tagger) in Rust | | Experimental |
| 104 | pymorphy2-fork/morphrs-py: Experimental morph-rs bindings for Python. | | Experimental |
| 105 | HectorPulido/human-language-toolkit-chatbot: nltk like chatbot in rust | | Experimental |
| 106 | xamgore/segtok: A rule-based sentence segmenter (splitter) and a word tokenizer using... | | Experimental |
| 107 | yuanzhoulvpi2017/Rust4SenVec: convert sentence to vector by nlp transformers model in Rust | | Experimental |
| 108 | sigrlami/lanhunch: Language Detection Library | | Experimental |
| 109 | proycon/lexmatch: Simple lexicon matcher against a text | | Experimental |
| 110 | loony-bean/stopwords-rs: Stopwords from popular text processing frameworks | | Experimental |
| 111 | arclabs561/phrasegen: Phrase generation and extraction | | Experimental |
| 112 | joshrotenberg/unimorph-rs: A Rust toolkit for working with UniMorph morphological data | | Experimental |
| 113 | cliftontoaster-reid/wit_owo: A Rust library for the Wit.ai API | | Experimental |
| 114 | arclabs561/textprep: Text preprocessing primitives: normalization, tokenization, and fast keyword... | | Experimental |
| 115 | allenai/rustberta-snli: 🦀 A Rust implementation of a RoBERTa classification model for the SNLI dataset | | Experimental |
| 116 | Adityagupta-dev/Indian-Address-Parser: The Indian Address Parser is an advanced Natural Language Processing (NLP)... | | Experimental |
| 117 | Chubek/upsc3ne: An obscenity detection API in Rust using Custom Implementations | | Experimental |
| 118 | chriamue/bert-cli: CLI for rust bert | | Experimental |
| 119 | dcavar/rust-tutorial-notebooks: Rust tutorials for NLP and AI | | Experimental |
| 120 | gembleman/bareun_rs: bareun-rs is an unofficial Rust library for Bareun, a Korean morphological analyzer. | | Experimental |
| 121 | eliangonde/langid: Rust implementation of the langid library for language identification.... | | Experimental |
| 122 | martinjack/uaddresspacy: 🇺🇦 UAddresspacy: spaCy parsing of Ukrainian addresses into types | | Experimental |
| 123 | wrnrlr/nlpg: NLP for Postgres | | Experimental |
| 124 | Bbeierle12/Word-Slush: Word frequency analyzer for Claude conversation exports | | Experimental |
| 125 | jesper-olsen/glove-rs: Rusty GloVe - compute GloVe word embeddings for a text corpus. | | Experimental |
| 126 | loony-bean/lda-rs: Experimenting with LDA in Rust | | Experimental |
| 127 | TheOpenDictionary/ttb: A lightning-fast tool for querying Tatoeba from the command-line ⚡ | | Experimental |