jiaeyan/Jiayan
Jiayan (甲言), the first NLP toolkit designed for Classical Chinese (also known as ancient Chinese or literary Chinese, 古汉语/古文/文言文), supports lexicon construction, tokenization, POS tagging, sentence segmentation, and punctuation.
This tool helps classical Chinese scholars and enthusiasts automatically process ancient texts. It takes raw classical Chinese text as input and can generate specialized vocabulary lists, break text into individual words, assign grammatical categories to words, identify sentence boundaries, and add modern punctuation. This is ideal for researchers, linguists, or anyone analyzing large volumes of classical Chinese literature who needs precise text segmentation and annotation.
659 stars. No commits in the last 6 months. Available on PyPI.
Use this if you need to accurately segment, annotate, and punctuate classical Chinese texts for linguistic analysis or digital humanities projects.
Not ideal if your primary interest is modern Chinese text processing, as this tool is specifically designed and optimized for classical Chinese.
Stars: 659
Forks: 71
Language: Python
License: MIT
Category:
Last pushed: Nov 02, 2021
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/jiaeyan/Jiayan"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
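For scripted access, the curl command above can be approximated in Python with the standard library alone. This is a minimal sketch, not a documented client: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the reading of the path as category, owner, and repository is inferred from the single example URL, and the assumption that the endpoint returns JSON is not confirmed by this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base URL copied from the curl example; everything after it is an assumed layout.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Assemble the endpoint URL from its three (assumed) path segments."""
    # Percent-encode each segment so unusual characters cannot break the path.
    segments = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(segments)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send a GET request and parse the body as JSON (assumed response format)."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("nlp", "jiaeyan", "Jiayan")` would request the same URL as the curl example. Since the keyless tier is limited to 100 requests per day, it is worth caching the result rather than fetching it repeatedly.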
Related tools
PyThaiNLP/pythainlp
Thai natural language processing in Python
hankcs/HanLP
Natural Language Processing for the next decade. Tokenization, Part-of-Speech Tagging, Named...
jacksonllee/pycantonese
Cantonese Linguistics and NLP
dongrixinyu/JioNLP
A Chinese NLP preprocessing and parsing package: accurate, efficient, and easy to use. www.jionlp.com
hankcs/pyhanlp
Chinese word segmentation