hankcs/multi-criteria-cws
Simple Solution for Multi-Criteria Chinese Word Segmentation
This tool segments Chinese text into words, a prerequisite for most Chinese natural language processing tasks. It takes raw Chinese text or pre-existing corpora as input and outputs segmented text, ready for further analysis or model training. It is aimed at researchers and computational linguists working with Chinese language data.
303 stars. No commits in the last 6 months.
Use this if you need to perform high-quality Chinese word segmentation for research, academic projects, or building NLP applications.
Not ideal if you need a simple, ready-to-use API for Chinese word segmentation without any setup, or if you don't have access to relevant Chinese corpora.
Stars: 303
Forks: 81
Language: Python
License: GPL-3.0
Category:
Last pushed: Aug 12, 2020
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/hankcs/multi-criteria-cws"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
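The curl request above can also be issued from Python. The following is a minimal sketch using only the standard library. It assumes the endpoint returns a JSON body, which this page does not confirm, and that the path follows the category/owner/repo pattern seen in the single URL shown. The names `quality_url` and `fetch_quality` are hypothetical helpers, not part of any published client.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the request URL (hypothetical helper).

    The category/owner/repo layout is inferred from the one URL shown above.
    """
    parts = [quote(p, safe="") for p in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository (hypothetical helper).

    Assumption: the response body is JSON. If it is not, json.loads raises
    a ValueError and the raw bytes should be inspected instead.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("nlp", "hankcs", "multi-criteria-cws")` would request the same URL as the curl command. No key is sent here, so the keyless 100 requests/day limit applies.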
Higher-rated alternatives
PyThaiNLP/pythainlp
Thai natural language processing in Python
hankcs/HanLP
Natural Language Processing for the next decade. Tokenization, Part-of-Speech Tagging, Named...
jacksonllee/pycantonese
Cantonese Linguistics and NLP
dongrixinyu/JioNLP
A Chinese NLP preprocessing and parsing toolkit: accurate, efficient, and easy to use. www.jionlp.com
hankcs/pyhanlp
Chinese word segmentation