fukuball/jieba-php
"結巴"中文分詞:做最好的 PHP 中文分詞、中文斷詞組件。 / "Jieba" (Chinese for "to stutter") Chinese text segmentation: built to be the best PHP Chinese word segmentation module.
This tool helps analyze Chinese text by breaking sentences into individual words or phrases, which is crucial for search engines, content analysis, and linguistic research. It takes a Chinese text string as input and outputs a list of segmented words, optionally with their part-of-speech tags and relevance scores. Anyone working with Chinese language data, such as content managers, data analysts, or linguists, will find this useful for tasks like keyword extraction or sentiment analysis.
Use this if you need to accurately segment Chinese text for analysis, search indexing, or content understanding, especially when working with PHP applications.
Not ideal if you require the most advanced natural language processing results, as large language models (LLMs) may offer superior accuracy for some tasks.
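The basic segmentation workflow described above looks roughly like the following in PHP. This is a minimal sketch assuming the package is installed via Composer (`composer require fukuball/jieba-php:dev-master`) and that its bundled dictionaries are available; raising `memory_limit` is needed because the dictionary is loaded into memory at init time.

```php
<?php
// Sketch of jieba-php usage, assuming a standard Composer install.
ini_set('memory_limit', '1024M'); // dictionary loading is memory-intensive
require_once __DIR__ . '/vendor/autoload.php';

use Fukuball\Jieba\Jieba;
use Fukuball\Jieba\Finalseg;

Jieba::init();     // load the main word-frequency dictionary
Finalseg::init();  // load the HMM model used for out-of-vocabulary words

// Segment a Chinese sentence into an array of words.
$words = Jieba::cut("怜香惜玉也得要看对象啊!");
print_r($words);
```

`Jieba::cut()` returns an array of word strings; the library also exposes keyword extraction and part-of-speech tagging through separate classes, which follow the same init-then-call pattern.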
Stars
1,373
Forks
258
Language
PHP
License
MIT
Category
NLP
Last pushed
Dec 16, 2025
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/fukuball/jieba-php"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Related tools
facebookresearch/stopes
A library for preparing data for machine translation research (monolingual preprocessing,...
Droidtown/ArticutAPI
API of Articut, a Chinese word segmenter that also provides semantic part-of-speech tagging. Word segmentation (斷詞/分詞) is the foundation of Chinese text processing; Articut uses no machine learning and no data models, relying only on modern vernacular Chinese grammar rules to achieve...
rkcosmos/deepcut
A Thai word tokenization library using Deep Neural Network
pytorch/text
Models, data loaders and abstractions for language processing, powered by PyTorch
jiesutd/NCRFpp
NCRF++, a Neural Sequence Labeling Toolkit. Easy use to any sequence labeling tasks (e.g. NER,...