jiaeyan/Jiayan

Jiayan (甲言) is an NLP toolkit focused on processing Classical Chinese (古汉语/古文/文言文/文言), described by its author as the first designed for that purpose. It supports lexicon construction, tokenization, POS tagging, sentence segmentation, and punctuation.

53 / 100 (Established)

This tool helps classical Chinese scholars and enthusiasts automatically process ancient texts. It takes raw classical Chinese text as input and can generate specialized vocabulary lists, break text into individual words, assign grammatical categories to words, identify sentence boundaries, and add modern punctuation. This is ideal for researchers, linguists, or anyone analyzing large volumes of classical Chinese literature who needs precise text segmentation and annotation.

659 stars. No commits in the last 6 months. Available on PyPI.

Use this if you need to accurately segment, annotate, and punctuate classical Chinese texts for linguistic analysis or digital humanities projects.

Not ideal if your primary interest is modern Chinese text processing, as this tool is specifically designed and optimized for classical Chinese.

classical-chinese-studies digital-humanities ancient-texts-analysis linguistic-annotation historical-linguistics
Stale 6m, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 25 / 25
Community 18 / 25


Stars: 659
Forks: 71
Language: Python
License: MIT
Last pushed: Nov 02, 2021
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/jiaeyan/Jiayan"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
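The curl request above can also be made from Python using only the standard library. The following is a minimal sketch rather than an official client. The helper names `build_quality_url` and `fetch_quality` are hypothetical, the reading of the path as category/owner/repo is inferred from the example URL, and the response is assumed to be JSON, which the page does not state.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example; everything after it is
# assumed to be <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the quality endpoint URL (hypothetical helper)."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the quality data for one repository.

    Assumes the endpoint returns a JSON body; the response schema is
    not documented on this page, so the decoded object is returned
    as-is without picking out any fields.
    """
    url = build_quality_url(category, owner, repo)
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request and counts against the daily limit):
# data = fetch_quality("nlp", "jiaeyan", "Jiayan")
```

Without a key, each call counts toward the 100 requests/day limit, so caching the result locally is advisable when checking many repositories.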