himkt/konoha

🌿 An easy-to-use Japanese text processing tool that lets you switch tokenizers with only small code changes.

51 / 100 (Established)

When you need to break Japanese text into individual words or sentences for analysis, this tool helps you do it consistently. You provide raw Japanese text, and it returns that text segmented into units such as words or sentences. It is designed for data scientists, linguists, and anyone else who needs to prepare Japanese text for further computational processing.

261 stars.

Use this if you need to reliably split Japanese text into words or sentences and want the flexibility to easily switch between different text segmentation methods.

Not ideal if you are looking for advanced natural language understanding features beyond basic text segmentation, such as sentiment analysis or named entity recognition.
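The idea of switching segmentation methods with a small code change can be sketched in plain Python. This is an illustration of the pattern only, not konoha's actual API: the names `SwitchableTokenizer`, `BACKENDS`, `character_tokenize`, `whitespace_tokenize`, and `split_sentences` are hypothetical, and the two backends are deliberately trivial stand-ins for real morphological analyzers.

```python
from typing import Callable, Dict, List


# Hypothetical stand-in backends; real tools would wrap a morphological analyzer.
def character_tokenize(text: str) -> List[str]:
    """Split text into single characters, skipping whitespace."""
    return [ch for ch in text if not ch.isspace()]


def whitespace_tokenize(text: str) -> List[str]:
    """Split text that has already been segmented with spaces."""
    return text.split()


# Registry that maps a backend name to its tokenizing function.
BACKENDS: Dict[str, Callable[[str], List[str]]] = {
    "character": character_tokenize,
    "whitespace": whitespace_tokenize,
}


class SwitchableTokenizer:
    """Uniform interface: the backend is chosen by name at construction."""

    def __init__(self, backend: str) -> None:
        if backend not in BACKENDS:
            raise ValueError(f"unknown backend: {backend!r}")
        self._tokenize = BACKENDS[backend]

    def tokenize(self, text: str) -> List[str]:
        return self._tokenize(text)


def split_sentences(text: str, period: str = "。") -> List[str]:
    """Naive sentence splitter: cut after each period, keeping the period."""
    sentences: List[str] = []
    current = ""
    for ch in text:
        current += ch
        if ch == period:
            sentences.append(current)
            current = ""
    if current.strip():
        # Keep trailing text that does not end with a period.
        sentences.append(current)
    return sentences
```

With this layout, changing the segmentation method means changing one string argument, for example `SwitchableTokenizer("character")` to `SwitchableTokenizer("whitespace")`, while the calling code that uses `tokenize` stays the same.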

Japanese-text-analysis, NLP-preprocessing, text-segmentation, linguistics, data-preparation
No Package, No Dependents
Maintenance 10 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 15 / 25

Stars: 261
Forks: 26
Language: Python
License: MIT
Last pushed: Mar 01, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/himkt/konoha"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
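The same request can be made from Python with only the standard library. This is a minimal sketch under two assumptions that the page does not confirm: that the path follows a category/owner/repository pattern (inferred from the single URL shown above) and that the endpoint returns JSON. The helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the curl example; the path pattern below is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the request URL, percent-encoding each path segment."""
    parts = [urllib.parse.quote(part, safe="") for part in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the data and parse it, assuming the body is UTF-8 encoded JSON."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("nlp", "himkt", "konoha")` would issue the same GET request as the curl command. Because the response fields are not documented here, inspect the returned object before relying on any particular key, and keep the daily request limit in mind.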