rwalk/gsdmm-rust
GSDMM: Short text clustering (Rust implementation)
This tool organizes short pieces of text, such as social media posts, product reviews, or survey responses, into meaningful groups. You provide a list of short texts and a vocabulary; it writes output files showing which group each text belongs to, the key words for each group, and the probability of each text belonging to each group. It suits analysts, marketers, and researchers who need to make sense of large volumes of unstructured short text quickly.
No commits in the last 6 months.
Use this if you need to automatically categorize many short text documents without knowing the exact number of categories beforehand.
Not ideal if you're working with very long documents or if you need to classify documents into predefined, known categories.
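To make the description above concrete, here is a minimal sketch of the collapsed Gibbs sampler behind GSDMM (Gibbs Sampling Dirichlet Multinomial Mixture). It is an illustration written for this page, not the code or API of rwalk/gsdmm-rust: the names `gsdmm`, `log_score`, `Cluster`, and `XorShift` are hypothetical, documents are assumed to be already encoded as indices into the vocabulary, and a small hand-rolled generator stands in for a real random number crate. The parameter `k` is only an upper bound on the number of groups; clusters that lose all their documents simply stay empty, which is why the exact number of categories need not be known beforehand.

```rust
use std::collections::HashMap;

/// Tiny deterministic xorshift generator so the sketch needs no external crates.
struct XorShift(u64);

impl XorShift {
    /// Returns a pseudo-random value in [0, 1).
    fn next_f64(&mut self) -> f64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        (self.0 >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Running statistics for one cluster.
#[derive(Default)]
struct Cluster {
    docs: usize,                   // number of documents in the cluster
    words: usize,                  // total number of word tokens
    counts: HashMap<usize, usize>, // occurrences of each word id
}

fn add(c: &mut Cluster, doc: &[usize]) {
    c.docs += 1;
    c.words += doc.len();
    for &w in doc {
        *c.counts.entry(w).or_insert(0) += 1;
    }
}

fn remove(c: &mut Cluster, doc: &[usize]) {
    c.docs -= 1;
    c.words -= doc.len();
    for &w in doc {
        *c.counts.get_mut(&w).unwrap() -= 1;
    }
}

/// Unnormalized log-probability of placing `doc` in cluster `c`.
/// The document itself must already be removed from the statistics.
fn log_score(doc: &[usize], c: &Cluster, alpha: f64, beta: f64, vocab: usize) -> f64 {
    let mut score = (c.docs as f64 + alpha).ln();
    let mut seen: HashMap<usize, usize> = HashMap::new();
    for (i, &w) in doc.iter().enumerate() {
        let j = seen.entry(w).or_insert(0); // earlier occurrences of w in this doc
        let n_zw = *c.counts.get(&w).unwrap_or(&0) as f64;
        score += (n_zw + beta + *j as f64).ln();
        score -= (c.words as f64 + vocab as f64 * beta + i as f64).ln();
        *j += 1;
    }
    score
}

/// Draws an index proportionally to exp(log_p[i]).
fn sample(log_p: &[f64], rng: &mut XorShift) -> usize {
    let max = log_p.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let weights: Vec<f64> = log_p.iter().map(|l| (l - max).exp()).collect();
    let mut u = rng.next_f64() * weights.iter().sum::<f64>();
    for (z, w) in weights.iter().enumerate() {
        if u < *w {
            return z;
        }
        u -= w;
    }
    weights.len() - 1
}

/// Returns one cluster label per document. Assumes k >= 1.
pub fn gsdmm(
    docs: &[Vec<usize>],
    k: usize,
    vocab: usize,
    alpha: f64,
    beta: f64,
    iters: usize,
    seed: u64,
) -> Vec<usize> {
    let mut rng = XorShift(seed.max(1)); // xorshift state must be non-zero
    let mut clusters: Vec<Cluster> = (0..k).map(|_| Cluster::default()).collect();
    let mut labels = Vec::with_capacity(docs.len());
    // Start from a random assignment.
    for doc in docs {
        let z = (rng.next_f64() * k as f64) as usize % k;
        add(&mut clusters[z], doc);
        labels.push(z);
    }
    // Repeatedly take each document out and re-sample its cluster.
    for _ in 0..iters {
        for (d, doc) in docs.iter().enumerate() {
            remove(&mut clusters[labels[d]], doc);
            let log_p: Vec<f64> = clusters
                .iter()
                .map(|c| log_score(doc, c, alpha, beta, vocab))
                .collect();
            let z = sample(&log_p, &mut rng);
            add(&mut clusters[z], doc);
            labels[d] = z;
        }
    }
    labels
}
```

A call such as `gsdmm(&docs, 8, vocab_size, 0.1, 0.1, 30, 42)` would return one label per document. The per-cluster word counts kept in `Cluster` are the kind of data from which key words per group can be read off, and normalizing the `log_score` values across clusters gives a per-document probability for each group.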
Stars: 24
Forks: 8
Language: Rust
License: MIT
Category:
Last pushed: Apr 26, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/rwalk/gsdmm-rust"
Higher-rated alternatives
PyThaiNLP/nlpo3
Thai natural language processing library in Rust, with Python and Node bindings.
forzagreen/n2words
Convert numbers to their written form, in 52+ languages.
greyblake/whatlang-rs
Natural language detection library for Rust. Try demo online: https://whatlang.org/
wikimedia/sentencex
A sentence segmentation library with wide language support optimized for speed and utility.
pemistahl/lingua-rs
The most accurate natural language detection library for Rust, suitable for short text and...