junhewk/RmecabKo
RmecabKo: R wrapper for eunjeon project (mecab-ko)
This tool helps R users analyze Korean text by splitting sentences into individual words and identifying each word's grammatical role. It takes raw Korean text as input and outputs structured information about words, their parts of speech, and even sentiment. It is suited to data scientists, linguists, and researchers working with Korean-language data.
No commits in the last 6 months.
Use this if you need to perform detailed linguistic analysis, such as morphological analysis, part-of-speech tagging, or N-gram tokenization on Korean text within the R environment.
Not ideal if you are working with languages other than Korean or prefer a text analysis environment outside of R.
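To make the N-gram tokenization step mentioned above concrete, here is a minimal, language-agnostic sketch in Python. It does not call RmecabKo or mecab-ko. It assumes the text has already been split into morphemes by some analyzer, and the `ngrams` helper and the sample morpheme list are hypothetical, hand-written for illustration only.

```python
from typing import List


def ngrams(tokens: List[str], n: int) -> List[str]:
    """Return space-joined n-grams over an already-tokenized sequence."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # Slide a window of width n across the token list; if the list is
    # shorter than n, the range is empty and no n-grams are produced.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical morpheme sequence written by hand (not real analyzer output).
morphemes = ["한국어", "텍스트", "를", "분석", "한다"]
bigrams = ngrams(morphemes, 2)
print(bigrams)
```

Building N-grams over morphemes rather than whitespace-separated words matters for Korean, where particles and endings attach directly to stems, so the quality of the N-grams depends on the segmentation that precedes this step.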
Stars: 10
Forks: 4
Language: Shell
License: not specified
Category: not listed
Last pushed: Mar 26, 2019
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/junhewk/RmecabKo"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
Higher-rated alternatives
EmilStenstrom/conllu
A CoNLL-U parser that takes a CoNLL-U formatted string and turns it into a nested python dictionary.
OpenPecha/Botok
🏷 བོད་ཏོག [pʰøtɔk̚] Tibetan word tokenizer in Python
taishi-i/nagisa
A Japanese tokenizer based on recurrent neural networks
zaemyung/sentsplit
A flexible sentence segmentation library using CRF model and regex rules
natasha/razdel
Rule-based token, sentence segmentation for Russian language