tetutaro/mecab_dictionaries

create various dictionaries for MeCab and MeCab CLI using fugashi

Quality score: 20 / 100 (Experimental)

When performing Japanese natural language processing, you need specialized dictionaries to accurately split sentences into individual words. This project provides scripts to create ready-to-use Python packages of various MeCab dictionaries, including UniDic, IPA, and JUMAN dictionaries, optionally enhanced with NEologd. It's for data scientists or researchers who need precise Japanese text analysis.

No commits in the last 6 months.

Use this if you are a developer working on Japanese text analysis and need to quickly set up MeCab dictionaries within your Python environment, especially when using 'fugashi' or 'mecab-python3'.
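As a rough illustration of the fugashi use case, the sketch below points fugashi at a dictionary directory and tokenizes a sentence. The helper names `mecab_args` and `tokenize` and the example path are hypothetical, and the way this project's generated packages expose their dictionary directory is not described here, so the path is taken as a plain argument. It assumes fugashi is installed and that MeCab's standard `-d` option is enough to select the dictionary; paths containing spaces are not handled.

```python
def mecab_args(dicdir: str) -> str:
    """Build the MeCab option string that selects a dictionary directory.

    Hypothetical helper: `-d` is MeCab's standard dictionary-directory option.
    """
    return f"-d {dicdir}"


def tokenize(text: str, dicdir: str):
    """Return (surface, feature) pairs for each token in `text`.

    Assumption: `dicdir` is a compiled MeCab dictionary directory.
    """
    # Imported lazily so mecab_args() can be used without fugashi installed.
    from fugashi import GenericTagger

    # GenericTagger is fugashi's tagger for non-UniDic dictionaries such as IPA or JUMAN.
    tagger = GenericTagger(mecab_args(dicdir))
    return [(word.surface, word.feature) for word in tagger(text)]
```

A call might look like `tokenize("すもももももももものうち", "/path/to/dicdir")`, where the path is a placeholder for wherever the dictionary was installed. Depending on the system, MeCab may also need a `-r` option pointing at a mecabrc file.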

Not ideal if you're looking for a pre-packaged application or a non-technical solution to analyze Japanese text without needing to build or manage dictionary resources yourself.

Tags: Japanese NLP, Morpheme Analysis, Text Processing, Linguistics, Information Extraction
Status: Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 4 / 25
Maturity 16 / 25
Community 0 / 25


Stars: 8
Forks: (no value shown)
Language: Python
License: MIT
Last pushed: Feb 19, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/tetutaro/mecab_dictionaries"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
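The same request can be made from Python with only the standard library. This is a minimal sketch: the helper names `quality_url` and `fetch_quality` are hypothetical, the `category/owner/repo` path layout is inferred from the single URL shown above, and the response is assumed to be JSON, which the page does not confirm.

```python
import json
import urllib.request

# Base URL taken from the curl example; the path layout after it is an inference.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the quality endpoint URL for one repository (hypothetical helper)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the quality record, assuming the body is JSON."""
    with urllib.request.urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

For example, `fetch_quality("nlp", "tetutaro", "mecab_dictionaries")` requests the same URL as the curl command. Since no key is sent, each call counts toward the 100 requests/day limit.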