yihong-chen/chinese-word-segmentation

Simple Chinese word segmentation with experiments on the PKU dataset

20 / 100 (Experimental)

This tool helps linguists, data scientists, and researchers split Chinese text into individual words. You input raw Chinese sentences or documents, and it outputs the text with explicit word boundaries, a prerequisite for downstream work such as natural language processing or text mining. It is intended for anyone who needs to pre-process Chinese text for computational tasks.
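To make the input and output concrete, here is a minimal sketch of one classic segmentation technique, greedy forward maximum matching. It is not taken from this repository, whose actual method is not described on this page; the function name `segment`, the `max_len` parameter, and the toy vocabulary are all hypothetical and chosen only for illustration.

```python
def segment(text, vocab, max_len=4):
    """Greedy forward maximum matching (illustrative only, not the repo's method).

    At each position, take the longest substring found in `vocab`;
    if none matches, emit a single character and move on.
    """
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, falling back to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                words.append(piece)
                i += size
                break
    return words


# Hypothetical toy vocabulary; a real system would use a large lexicon
# or a statistical model trained on a corpus such as PKU.
vocab = {"北京", "大学", "北京大学", "学生"}
print(" ".join(segment("我是北京大学学生", vocab)))  # 我 是 北京大学 学生
```

Greedy matching is easy to follow but resolves ambiguity purely by word length, which is one reason statistical segmenters are usually preferred in practice.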

No commits in the last 6 months.

Use this if you need to reliably segment Chinese text into words for linguistic analysis, search indexing, or other text processing applications.

Not ideal if you require extremely high-performance real-time segmentation or are working with highly specialized jargon that might not be covered by standard models.

Chinese-NLP text-pre-processing linguistic-analysis data-science-text
No License Stale 6m No Package No Dependents
Maintenance 0 / 25
Adoption 4 / 25
Maturity 8 / 25
Community 8 / 25

Stars: 8
Forks: 1
Language: Jupyter Notebook
License: None
Last pushed: Apr 18, 2018
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/yihong-chen/chinese-word-segmentation"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
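The same request can be made from Python with the standard library alone. This is a sketch under two assumptions that this page does not confirm: that the endpoint returns JSON, and that the path follows a category/owner/repo layout as the single URL above suggests. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    # Assumed layout, inferred from the one example URL: /<category>/<owner>/<repo>
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10):
    # Assumes the response body is JSON; adjust the parsing if it is not.
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("nlp", "yihong-chen", "chinese-word-segmentation")` issues the same GET request as the curl command. Unkeyed requests count against the 100 requests/day limit.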