houbb/segment
The jieba-analysis tool for Java. (A more flexible, elegant, easy-to-use, and high-performance Java word-segmentation implementation based on the jieba dictionary. Supports part-of-speech tagging.)
This tool analyzes Chinese text by breaking sentences into individual words or phrases, a process called word segmentation. You feed it raw Chinese text, and it outputs lists of segmented words, often with part-of-speech tags or word counts. It's designed for anyone working with Chinese language data who needs to prepare it for further analysis, such as a linguist, market researcher, or data analyst.
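To illustrate what word segmentation means in practice, here is a toy forward-maximum-matching segmenter in Python. The mini dictionary and the algorithm are illustrative assumptions only; real tools like houbb/segment use much larger dictionaries plus statistical models, and this sketch is not the library's actual implementation.

```python
# Toy forward-maximum-matching (FMM) word segmenter.
# The dictionary below is a hypothetical example, not a real lexicon.
DICTIONARY = {"我们", "喜欢", "自然", "语言", "自然语言", "处理", "自然语言处理"}
MAX_WORD_LEN = max(len(w) for w in DICTIONARY)

def segment(text: str) -> list[str]:
    """Greedily match the longest dictionary word at each position."""
    words, i = [], 0
    while i < len(text):
        # Try the longest possible match first, fall back to a single char.
        for length in range(min(MAX_WORD_LEN, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if length == 1 or candidate in DICTIONARY:
                words.append(candidate)
                i += length
                break
    return words

print(segment("我们喜欢自然语言处理"))  # ['我们', '喜欢', '自然语言处理']
```

Note how the segmenter prefers the longest match ("自然语言处理") over shorter dictionary entries ("自然", "语言"); ambiguity handling like this is the core problem Chinese segmenters solve.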
156 stars. No commits in the last 6 months.
Use this if you need a flexible and high-performance way to segment Chinese text and potentially tag words with their grammatical roles for natural language processing.
Not ideal if your primary need is for English or other non-Chinese language text analysis, as this tool is specifically designed for Chinese.
Stars
156
Forks
29
Language
Java
License
Apache-2.0
Category
NLP
Last pushed
Feb 28, 2024
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/houbb/segment"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
PyThaiNLP/pythainlp
Thai natural language processing in Python
hankcs/HanLP
Natural Language Processing for the next decade. Tokenization, Part-of-Speech Tagging, Named...
jacksonllee/pycantonese
Cantonese Linguistics and NLP
dongrixinyu/JioNLP
A Chinese NLP preprocessing and parsing package: accurate, efficient, and easy to use. www.jionlp.com
hankcs/pyhanlp
Chinese word segmentation