kotaro-kinoshita/yomitoku
YomiToku is a Python package that provides an AI-powered document image analysis engine designed specifically for the Japanese language.
This tool helps Japanese businesses and researchers convert scanned or image-based Japanese documents, like reports or forms, into editable text. It takes images of documents, including those with handwriting or complex layouts, and outputs structured text in formats like HTML, Markdown, JSON, or CSV. Anyone needing to extract and organize information from Japanese document images for analysis or record-keeping would find this useful.
1,356 stars. Actively maintained with 10 commits in the last 30 days.
Use this if you need to accurately extract text and layout information from Japanese document images, including specialized layouts like vertical writing or tables, and export it into structured, machine-readable formats.
Not ideal if your primary need is to read text from signs or other non-document images, or if you consistently work with very low-resolution images.
Stars: 1,356
Forks: 51
Language: Python
License: not listed
Category: not listed
Last pushed: Mar 13, 2026
Commits (30d): 10
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/kotaro-kinoshita/yomitoku"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
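The same request can be sketched in Python using only the standard library. The URL pattern is taken from the curl example above. The helper names `build_quality_url` and `fetch_quality` are hypothetical. That the endpoint returns a JSON body is an assumption, since this page does not document the response format.

```python
import json
import urllib.request

# Base path copied from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Assemble the endpoint URL for one repository (hypothetical helper)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Request the record for one repository (hypothetical helper).

    Assumption: the response body is JSON. The page does not state the
    format, so inspect the raw response first if parsing fails.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; no network request is made here.
    print(build_quality_url("ml-frameworks", "kotaro-kinoshita", "yomitoku"))
```

Calling `fetch_quality("ml-frameworks", "kotaro-kinoshita", "yomitoku")` performs the actual request. Without a key, each call counts against the 100 requests/day limit, so a script that polls many repositories may want to cache results.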
Higher-rated alternatives
JaidedAI/EasyOCR
Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin,...
breezedeus/CnSTD
CnSTD: PyTorch/MXNet-based Chinese/English scene text detection (Scene Text Detection), mathematical formula detection (Mathematical Formula...
githubharald/SimpleHTR
Handwritten Text Recognition (HTR) system implemented with TensorFlow.
felixdittrich92/OnnxTR
OnnxTR a docTR (Document Text Recognition) library Onnx pipeline wrapper - for seamless,...
mindee/doctr
docTR (Document Text Recognition) - a seamless, high-performing & accessible library for...