seanghay/KhmerOCR
A Fast Khmer Optical Character Recognition (KhmerOCR)
This tool helps individuals and organizations convert scanned Khmer documents, images, or PDFs into editable text formats like Word, HTML, Markdown, or plain text. It accurately recognizes Khmer script, detects different font styles (like Moul vs. Regular), and preserves document layouts. Anyone working with physical or digital Khmer documents who needs to extract and edit their content would find this useful.
Use this if you need to quickly and accurately convert images or PDF documents containing Khmer script into editable digital text.
Not ideal if your documents contain English or other non-Khmer languages, as those are not currently supported.
Stars: 48
Forks: 9
Language: C++
License: MIT
Category:
Last pushed: Feb 12, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/seanghay/KhmerOCR"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
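The same request can be made from a script. The following is a minimal sketch, not an official client: the helper names `build_quality_url` and `fetch_quality` are hypothetical, reading the `nlp` path segment as a category is an assumption based on the URL above, and the response is assumed (not confirmed by this page) to be JSON.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL in the same shape as the curl example.

    Treating the first path segment ("nlp") as a category is an assumption.
    """
    # Percent-encode each segment so unusual characters cannot break the path.
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send a keyless GET request and parse the body.

    Assumes the endpoint returns JSON; json.loads raises ValueError otherwise.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("nlp", "seanghay", "KhmerOCR")` issues the same request as the curl command. Keyless access is limited to 100 requests/day, so it is worth caching the result rather than polling.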
Higher-rated alternatives
VietHoang1512/khmer-nltk
Khmer language processing toolkit
PyThaiNLP/attacut
A Fast and Accurate Neural Thai Word Segmenter
UlugbekSalaev/UzTransliterator
UzTransliterator | State-of-the-art machine transliteration tool for Uzbek language
seanghay/khmerphonemizer
A Free, Standalone and Open-Source Khmer Grapheme-to-Phonemes.
ionite34/Aquila-Resolve
Augmented Recurrent Neural Grapheme-to-Phoneme conversion with Inflectional Orthography.