azagniotov/solr-lucene-analyzer-sudachi
The Sudachi Japanese morphological analyzer, packaged as a Solr plugin.
This Solr plugin helps search engineers and data professionals search and analyze Japanese text accurately. It improves on Solr's standard Japanese text processing by using the Sudachi morphological analyzer. It takes the raw Japanese text in a Solr index as input and yields more precise search results and text analysis.
No commits in the last 6 months.
Use this if you need to improve the accuracy and relevance of Japanese text search and analysis within your Solr-powered applications.
Not ideal if your application primarily deals with languages other than Japanese, or if you are not using Solr for search.
Stars: 9
Forks: 1
Language: Java
License: Apache-2.0
Category:
Last pushed: Aug 03, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/azagniotov/solr-lucene-analyzer-sudachi"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
Higher-rated alternatives
EmilStenstrom/conllu
A CoNLL-U parser that takes a CoNLL-U formatted string and turns it into a nested python dictionary.
OpenPecha/Botok
🏷 བོད་ཏོག [pʰøtɔk̚] Tibetan word tokenizer in Python
zaemyung/sentsplit
A flexible sentence segmentation library using a CRF model and regex rules
taishi-i/nagisa
A Japanese tokenizer based on recurrent neural networks
natasha/razdel
Rule-based token and sentence segmentation for the Russian language