microsoft/genalog

Genalog is an open-source, cross-platform Python package for generating synthetic document images with custom degradations and text alignment capabilities.

Quality score: 51 / 100 (Established)

This tool helps machine learning engineers and data scientists create realistic synthetic document images from plain text and HTML templates. It takes your text and layout designs, then applies various visual degradations to mimic scanned documents with noise, blur, and other imperfections. The output is a dataset of diverse document images that can be used for training and evaluating optical character recognition (OCR) models.
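The degradation step described above can be pictured as an ordered list of effects applied to a clean page image. The following is a minimal, standard-library-only sketch of that idea; it does not use genalog's API, and every name in it (`make_page`, `salt_and_pepper`, `box_blur`, `degrade`) is hypothetical and chosen only for illustration.

```python
import random


def make_page(width=32, height=16):
    """Build a white page (255) with a few dark horizontal 'text lines' (0)."""
    page = [[255] * width for _ in range(height)]
    for y in range(2, height - 2, 4):
        for x in range(3, width - 3):
            page[y][x] = 0
    return page


def salt_and_pepper(img, amount, rng):
    """Set roughly `amount` of the pixels to pure black or pure white."""
    out = [row[:] for row in img]
    for row in out:
        for x in range(len(row)):
            if rng.random() < amount:
                row[x] = rng.choice((0, 255))
    return out


def box_blur(img):
    """3x3 mean filter; edge pixels average over the neighbours that exist."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [
                img[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            ]
            out[y][x] = sum(vals) // len(vals)
    return out


def degrade(img, steps):
    """Apply each (function, kwargs) step in order and return the result."""
    for fn, kwargs in steps:
        img = fn(img, **kwargs)
    return img


rng = random.Random(0)  # fixed seed so the noise pattern is reproducible
clean = make_page()
steps = [
    (salt_and_pepper, {"amount": 0.05, "rng": rng}),  # scanner speckle
    (box_blur, {}),                                    # slight defocus
]
noisy = degrade(clean, steps)
```

Keeping the effects as an ordered list makes it easy to vary the order and parameters per document, which is one plausible way to get the diversity a training set needs. A real pipeline would operate on rendered page images rather than nested lists.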

346 stars. No commits in the last 6 months. Available on PyPI.

Use this if you need to generate large, varied datasets of document images with controlled noise for training and testing OCR systems or document processing pipelines.

Not ideal if you're looking for an off-the-shelf OCR solution or simply want to extract text from existing images without needing to create synthetic data.

document-processing OCR-training synthetic-data-generation computer-vision ML-data-preparation
Stale 6m
Maintenance 0 / 25
Adoption 10 / 25
Maturity 25 / 25
Community 16 / 25


Stars: 346
Forks: 35
Language: Jupyter Notebook
License: MIT
Last pushed: Jan 18, 2024
Commits (30d): 0
Dependencies: 16

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/microsoft/genalog"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
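
The same request can be made from Python with the standard library. This is a sketch under two assumptions that the page does not confirm: that the path segments after `/quality/` are category, owner, and repository (inferred from the single URL above), and that the endpoint returns JSON. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL (segment meaning is an assumption)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10):
    """GET the endpoint and decode the body, assuming it is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("generative-ai", "microsoft", "genalog")
```

With the keyless limit of 100 requests per day, it is worth caching the response locally rather than calling `fetch_quality` in a loop.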