rth/vtext
Simple NLP in Rust with Python bindings
This project helps data scientists and machine learning engineers prepare text data for analysis. It takes raw text and efficiently converts it into structured representations: word tokens, word stems, or numerical matrices of word counts. The processed output can then be fed to machine learning models that analyze patterns in language.
154 stars and 113 monthly downloads. No commits in the last 6 months.
Use this if you need a high-performance toolkit to quickly process large volumes of text data for natural language processing tasks.
Not ideal if you require a very broad range of advanced NLP functionalities beyond basic tokenization, stemming, and text vectorization.
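The pipeline described above (raw text to tokens to a word-count matrix) can be sketched in a few lines of plain Python. This is only an illustration of the idea using the standard library; it does not use vtext's API, the function names `tokenize` and `count_matrix` are hypothetical, and the regex tokenizer is a deliberately simple stand-in. Stemming is left out.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens.

    Assumption: a simple regex stand-in, not vtext's tokenizer.
    """
    return re.findall(r"\w+", text.lower())


def count_matrix(docs):
    """Return (vocabulary, matrix); matrix[i][j] counts vocabulary[j] in docs[i]."""
    tokenized = [tokenize(doc) for doc in docs]
    # Sorted vocabulary gives every token a stable column index.
    vocabulary = sorted({tok for toks in tokenized for tok in toks})
    index = {tok: j for j, tok in enumerate(vocabulary)}
    matrix = []
    for toks in tokenized:
        row = [0] * len(vocabulary)
        for tok, n in Counter(toks).items():
            row[index[tok]] = n
        matrix.append(row)
    return vocabulary, matrix


docs = ["The cat sat on the mat.", "The dog sat."]
vocabulary, matrix = count_matrix(docs)
print(vocabulary)  # ['cat', 'dog', 'mat', 'on', 'sat', 'the']
for row in matrix:
    print(row)     # [1, 0, 1, 1, 1, 2] then [0, 1, 0, 0, 1, 1]
```

A dense list-of-lists is used here for readability. For large corpora a sparse matrix is the usual choice, since most documents contain only a small fraction of the vocabulary.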
Stars: 154
Forks: 9
Language: Rust
License: Apache-2.0
Category:
Last pushed: Jul 06, 2023
Monthly downloads: 113
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/rth/vtext"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
Higher-rated alternatives
PyThaiNLP/nlpo3
Thai natural language processing library in Rust, with Python and Node bindings.
forzagreen/n2words
Convert numbers to their written form, in 52+ languages.
greyblake/whatlang-rs
Natural language detection library for Rust. Try demo online: https://whatlang.org/
wikimedia/sentencex
A sentence segmentation library with wide language support optimized for speed and utility.
pemistahl/lingua-rs
The most accurate natural language detection library for Rust, suitable for short text and...