zamgi/lingvo--TextSegmenter

Text segmentation into separate words using a simple unigram model and the Viterbi algorithm

Score: 31 / 100 (Emerging)

This tool breaks long, unspaced strings of characters, such as those found in URLs, product codes, or poorly formatted text, into individual, meaningful words. It takes unsegmented text as input and outputs the same text as a sequence of separate words. It is aimed at data analysts, researchers, and anyone else who needs to process, analyze, or clean up text that lacks proper spacing between words.

No commits in the last 6 months.

Use this if you have text where words are run together without spaces and you need to split them into discrete units for easier reading or analysis.

Not ideal if you need advanced natural language processing features beyond simple word segmentation, such as grammar correction or semantic understanding.
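
To make the approach named in the project description concrete, here is a minimal Python sketch of unigram word segmentation with Viterbi-style dynamic programming. It is not the project's C# code: the function name `segment`, the `unigram_counts` dictionary, the `max_word_len` limit, and the unknown-word penalty are all assumptions chosen for illustration.

```python
import math


def segment(text, unigram_counts, max_word_len=20):
    """Split `text` into the most probable word sequence under a unigram model.

    `unigram_counts` maps known words to raw frequencies (hypothetical data).
    """
    total = sum(unigram_counts.values())

    def log_prob(word):
        count = unigram_counts.get(word)
        if count:
            return math.log(count / total)
        # Assumed penalty for unknown words: it grows with length, so long
        # unknown chunks are strongly discouraged.
        return -math.log(total) - len(word) * math.log(10)

    n = len(text)
    best = [0.0] + [-math.inf] * n  # best[i]: best log-probability of text[:i]
    back = [0] * (n + 1)            # back[i]: start index of the last word in text[:i]

    for end in range(1, n + 1):
        for start in range(max(0, end - max_word_len), end):
            score = best[start] + log_prob(text[start:end])
            if score > best[end]:
                best[end] = score
                back[end] = start

    # Follow the back-pointers from the end of the string to recover the words.
    words = []
    end = n
    while end > 0:
        start = back[end]
        words.append(text[start:end])
        end = start
    return words[::-1]
```

With the toy counts `{"text": 50, "segmentation": 10, "is": 200, "fun": 30}`, calling `segment("textsegmentationisfun", counts)` returns `["text", "segmentation", "is", "fun"]`. A realistic setup would load counts from a large corpus and tune the unknown-word penalty.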

Tags: text-analysis, data-preprocessing, text-cleanup, data-quality, information-extraction
Status: Stale 6m, No Package, No Dependents
Maintenance 2 / 25
Adoption 5 / 25
Maturity 16 / 25
Community 8 / 25

Stars: 9
Forks: 1
Language: C#
License: MIT
Last pushed: Oct 10, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/zamgi/lingvo--TextSegmenter"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
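
The same request can be made from a script. The sketch below uses only the Python standard library and the endpoint shown in the curl command above. The helper names `quality_url` and `fetch_quality` are hypothetical, and treating the response body as JSON is an assumption, since the page does not document the response format.

```python
import json
import urllib.request

# Base path taken from the curl example; everything after it is category/owner/repo.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, repo):
    """Build the endpoint URL for a repository, e.g. category 'nlp'."""
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category, repo, timeout=10):
    """Fetch the quality record for a repository.

    Assumption: the endpoint returns a JSON body; adjust the parsing if not.
    """
    with urllib.request.urlopen(quality_url(category, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("nlp", "zamgi/lingvo--TextSegmenter")` issues one request, which counts against the daily limit mentioned above.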