kevincobain2000/jProcessing
Japanese Natural Language Processing Libraries
jProcessing helps people working with Japanese text by splitting sentences into individual words, converting text to its Katakana or Romaji reading, and measuring the similarity between Japanese phrases. It takes raw Japanese text as input and produces structured linguistic information or phonetic representations, which makes it useful for linguists, language learners, and data analysts processing Japanese content.
148 stars. No commits in the last 6 months.
Use this if you need to analyze Japanese text at a granular level, convert Japanese characters to their phonetic spellings, or compare the similarity of different Japanese sentences.
Not ideal if you primarily need advanced machine translation, speech recognition, or conversational AI in Japanese; the library focuses on foundational linguistic processing.
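To make the "phonetic Katakana" idea concrete, here is a minimal sketch of the simplest case: mapping hiragana to katakana by shifting Unicode code points. This is not jProcessing's API. The function name `hiragana_to_katakana` is hypothetical, and the sketch leaves kanji untouched, since reading kanji requires a dictionary or morphological analyzer such as the library relies on.

```python
# Illustrative only: a hypothetical helper, not part of jProcessing.
# Hiragana (U+3041..U+3096) and katakana (U+30A1..U+30F6) are laid out
# in parallel in Unicode, exactly 0x60 code points apart.
HIRAGANA_START = 0x3041  # ぁ
HIRAGANA_END = 0x3096    # ゖ
KANA_OFFSET = 0x60       # distance from a hiragana character to its katakana twin


def hiragana_to_katakana(text: str) -> str:
    """Shift each hiragana character into the katakana block.

    Characters outside the hiragana range (kanji, Latin letters,
    punctuation) are returned unchanged.
    """
    return "".join(
        chr(ord(ch) + KANA_OFFSET)
        if HIRAGANA_START <= ord(ch) <= HIRAGANA_END
        else ch
        for ch in text
    )


print(hiragana_to_katakana("にほんご"))  # prints ニホンゴ
```

A kanji string such as 日本語 passes through unchanged, which is why a real converter needs a tokenizer that supplies readings first.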
Stars: 148
Forks: 30
Language: OpenEdge ABL
License: BSD-2-Clause
Category:
Last pushed: Sep 09, 2020
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/kevincobain2000/jProcessing"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
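The same request can be sketched in Python using only the standard library. Two things here are assumptions rather than documented behavior: that the endpoint returns JSON, and that the path follows a `category/owner/repo` pattern, which is inferred from the single URL shown above. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path taken from the curl example; the segment layout after it
# (category/owner/repo) is inferred from that one URL, not documented.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the request URL for one repository."""
    return f"{API_BASE}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response, assuming the body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("nlp", "kevincobain2000", "jProcessing")` issues the same GET request as the curl command. With the anonymous limit of 100 requests/day, it is worth caching the result if you poll more than occasionally.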
Higher-rated alternatives
EmilStenstrom/conllu
A CoNLL-U parser that takes a CoNLL-U formatted string and turns it into a nested python dictionary.
OpenPecha/Botok
🏷 བོད་ཏོག [pʰøtɔk̚] Tibetan word tokenizer in Python
taishi-i/nagisa
A Japanese tokenizer based on recurrent neural networks
zaemyung/sentsplit
A flexible sentence segmentation library using CRF model and regex rules
natasha/razdel
Rule-based token, sentence segmentation for Russian language