jsrpy/Chinese-NLP-Jieba
This is an introduction to Chinese word segmentation using Jieba.
This helps you break down Chinese text into individual words, which is a crucial first step for many types of language analysis. You provide raw Chinese sentences or documents, and it gives you a list of separated words. This is used by anyone working with Chinese language data, such as researchers, linguists, or data analysts.
No commits in the last 6 months.
Use this if you need to prepare Chinese text for analysis by accurately segmenting it into words.
Not ideal if you are working with languages other than Chinese, or if you need higher-level NLP tasks such as sentiment analysis that do not start from word segmentation.
Stars: 14
Forks: 1
Language: Jupyter Notebook
License: MIT
Category:
Last pushed: May 31, 2018
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/jsrpy/Chinese-NLP-Jieba"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
Higher-rated alternatives
PyThaiNLP/pythainlp
Thai natural language processing in Python
hankcs/HanLP
Natural Language Processing for the next decade. Tokenization, Part-of-Speech Tagging, Named...
jacksonllee/pycantonese
Cantonese Linguistics and NLP
dongrixinyu/JioNLP
A Chinese NLP preprocessing and parsing toolkit: accurate, efficient, and easy to use. www.jionlp.com
hankcs/pyhanlp
Chinese word segmentation