khmerlang/elasticsearch-analysis-khmerlang
Khmer Analysis Plugin for Elasticsearch
This Elasticsearch plugin prepares Khmer text for search and analysis. It takes raw Khmer text, corrects common character-ordering and usage errors, segments the text into individual words, handles Khmer numbers, applies synonyms, and outputs a list of searchable terms. It suits anyone working with large volumes of Khmer text, such as linguists, researchers, or data analysts.
No commits in the last 6 months.
Use this if you need to accurately process and search through Khmer language documents in a database or search engine.
Not ideal if your primary goal is real-time translation or highly nuanced natural language understanding beyond basic text segmentation and standardization.
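To make two of the stages described above more concrete, here is a minimal Python sketch of number handling and synonym expansion. It is an illustration only, not the plugin's code (the plugin itself is written in Java). It assumes that "handling Khmer numbers" means mapping the Khmer digits U+17E0 to U+17E9 onto ASCII digits, which is one plausible reading of the description. It also assumes the input has already been segmented into words. The names `normalize_khmer_digits`, `expand_synonyms`, and `analyze` are hypothetical.

```python
# Illustrative sketch only; these helpers are hypothetical and are not
# part of the plugin's API.

KHMER_DIGIT_ZERO = 0x17E0  # Unicode code point of KHMER DIGIT ZERO

# Translation table: Khmer digits U+17E0..U+17E9 -> "0".."9".
_DIGIT_TABLE = {KHMER_DIGIT_ZERO + i: str(i) for i in range(10)}


def normalize_khmer_digits(text: str) -> str:
    """Replace Khmer digits with ASCII digits; leave other characters alone."""
    return text.translate(_DIGIT_TABLE)


def expand_synonyms(tokens, synonyms):
    """Emit each token followed by any synonyms registered for it."""
    terms = []
    for token in tokens:
        terms.append(token)
        terms.extend(synonyms.get(token, []))
    return terms


def analyze(tokens, synonyms):
    """Normalize digits in pre-segmented tokens, then expand synonyms.

    Word segmentation and character-order correction are assumed to have
    happened already; they need a dictionary or model and are out of scope.
    """
    normalized = [normalize_khmer_digits(t) for t in tokens]
    return expand_synonyms(normalized, synonyms)


if __name__ == "__main__":
    # "\u17e2\u17e0\u17e2\u17e5" is the Khmer-digit spelling of 2025.
    print(analyze(["\u17e2\u17e0\u17e2\u17e5", "abc"], {"abc": ["xyz"]}))
```

Keeping each stage as a separate function loosely mirrors how search-engine analysis chains are usually composed from small filters, though the plugin's actual filter names and ordering are not stated on this page.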
Stars: 20
Forks: 2
Language: Java
License: GPL-3.0
Last pushed: Jul 02, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/khmerlang/elasticsearch-analysis-khmerlang"
Open to everyone: 100 requests/day with no key. A free key raises the limit to 1,000 requests/day.
Higher-rated alternatives
VietHoang1512/khmer-nltk: Khmer language processing toolkit
PyThaiNLP/attacut: A Fast and Accurate Neural Thai Word Segmenter
UlugbekSalaev/UzTransliterator: State-of-the-art machine transliteration tool for Uzbek language
seanghay/KhmerOCR: A Fast Khmer Optical Character Recognition (KhmerOCR)
seanghay/khmerphonemizer: A Free, Standalone and Open-Source Khmer Grapheme-to-Phonemes