zamgi/lingvo--TextSegmenter
Text segmentation into separate words using a simple unigram model and the Viterbi algorithm
This tool splits long, unspaced character strings, such as those found in URLs, product codes, or poorly formatted text, into individual, meaningful words. It takes unsegmented text as input and outputs a sequence of separated words. It is useful for anyone who processes or analyzes text that lacks spacing between words, such as data analysts and researchers doing text cleanup.
No commits in the last 6 months.
Use this if you have text where words are run together without spaces and you need to split them into discrete units for easier reading or analysis.
Not ideal if you need advanced natural language processing features beyond simple word segmentation, such as grammar correction or semantic understanding.
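To make the approach named above concrete, here is a minimal Python sketch of unigram word segmentation with Viterbi-style dynamic programming. It is not taken from the repository (which is written in C#) and may differ from its implementation. The function name `segment`, the parameters `max_word_len` and `unknown_prob`, and the toy probabilities are all assumptions made for illustration.

```python
import math


def segment(text, word_probs, max_word_len=20, unknown_prob=1e-10):
    """Return the most probable split of `text` under a unigram model.

    `word_probs` maps known words to probabilities (hypothetical data).
    Unknown pieces pay a per-character penalty of log(unknown_prob),
    an assumption of this sketch rather than a documented behavior.
    """
    n = len(text)
    log_unknown = math.log(unknown_prob)

    # best[i]: best log-probability of any segmentation of text[:i]
    # back[i]: start index of the last word in that best segmentation
    best = [-math.inf] * (n + 1)
    back = [0] * (n + 1)
    best[0] = 0.0

    for end in range(1, n + 1):
        for start in range(max(0, end - max_word_len), end):
            piece = text[start:end]
            if piece in word_probs:
                log_p = math.log(word_probs[piece])
            else:
                log_p = log_unknown * len(piece)
            score = best[start] + log_p
            if score > best[end]:
                best[end] = score
                back[end] = start

    # Walk the back-pointers from the end to recover the words.
    words = []
    i = n
    while i > 0:
        words.append(text[back[i]:i])
        i = back[i]
    words.reverse()
    return words


# Toy unigram probabilities, invented for this example.
toy_probs = {"the": 0.05, "quick": 0.001, "brown": 0.001, "fox": 0.001}
print(segment("thequickbrownfox", toy_probs))
# ['the', 'quick', 'brown', 'fox']
```

Working in log space avoids underflow when many small probabilities are multiplied, and capping the candidate word length keeps the run time roughly linear in the input length. A real dictionary would supply probabilities estimated from a corpus.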
Stars: 9
Forks: 1
Language: C#
License: MIT
Category:
Last pushed: Oct 10, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/zamgi/lingvo--TextSegmenter"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
wooorm/franc
Natural language detection
microsoft/Recognizers-Text
Microsoft.Recognizers.Text provides recognition and resolution of numbers, units, date/time,...
winkjs/wink-pos-tagger
English Part-of-speech (POS) tagger
sillsdev/machine
Machine is a natural language processing library for .NET that is focused on providing tools for...
ayoungprogrammer/Lango
Language Lego