wchan757/Cantonese_Word_Segmentation
Dictionary for Cantonese word segmentation
This dictionary helps anyone working with Cantonese text break sentences into meaningful words accurately. With it, raw Cantonese sentences can be split into a list of individual words, a step that search, analysis, and translation all depend on. It is aimed at linguists, data analysts, and anyone else processing Cantonese language data.
No commits in the last 6 months.
Use this if you need to improve the accuracy of word segmentation for Cantonese text.
Not ideal if your primary focus is on Mandarin Chinese or other languages, as this dictionary is specifically for Cantonese.
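To make the idea of dictionary-based segmentation concrete, here is a minimal sketch of greedy forward maximum matching in Python. It is not taken from this repository: the `segment` function and the four-entry `toy_dictionary` are hypothetical, and the actual file format and recommended segmenter for this dictionary are not described here.

```python
def segment(sentence, dictionary, max_len=None):
    """Split `sentence` into words by greedy forward maximum matching.

    `dictionary` is any set-like collection of known words. This is an
    illustrative technique, not this repository's own method.
    """
    if max_len is None:
        # Longest dictionary entry bounds the candidate length (at least 1).
        max_len = max(1, max((len(w) for w in dictionary), default=1))
    words = []
    i = 0
    while i < len(sentence):
        # Try the longest candidate first, then shrink one character at a time.
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            candidate = sentence[i:i + size]
            # A single character is always accepted as a fallback, so
            # out-of-dictionary text is still consumed.
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical toy dictionary, for illustration only.
toy_dictionary = {"我哋", "今日", "去", "飲茶"}
print(segment("我哋今日去飲茶", toy_dictionary))
# -> ['我哋', '今日', '去', '飲茶']
```

A larger and more Cantonese-specific dictionary lets a matcher like this recognise more multi-character words instead of falling back to single characters, which is where a dedicated Cantonese word list would help.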
Stars: 38
Forks: 5
Language: not listed
License: not listed
Category:
Last pushed: Jun 04, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/wchan757/Cantonese_Word_Segmentation"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
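The same request can be made from Python using only the standard library. The sketch below assumes two things the page does not state: that the endpoint returns JSON, and that other repositories follow the same `owner/repo` URL pattern. The helper names `build_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/nlp"


def build_url(owner, repo):
    """Build the request URL (assumes an owner/repo path pattern)."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner, repo, timeout=10):
    """Fetch the listing data; assumes the response body is JSON."""
    with urllib.request.urlopen(build_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("wchan757", "Cantonese_Word_Segmentation")
# print(data)
```

Keeping URL construction separate from the network call makes it easy to check the URL without spending any of the daily request quota.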
Higher-rated alternatives
PyThaiNLP/pythainlp
Thai natural language processing in Python
hankcs/HanLP
Natural Language Processing for the next decade. Tokenization, Part-of-Speech Tagging, Named...
jacksonllee/pycantonese
Cantonese Linguistics and NLP
dongrixinyu/JioNLP
A Chinese NLP preprocessing and parsing toolkit: accurate, efficient, and easy to use. www.jionlp.com
hankcs/pyhanlp
Chinese word segmentation