clovaai/synthtiger

Official Implementation of SynthTIGER (Synthetic Text Image Generator), ICDAR 2021

Score: 58 / 100 (Established)

This tool generates highly realistic synthetic text images, which are crucial for training and evaluating Optical Character Recognition (OCR) models. You provide a list of words or sentences, along with fonts and image backgrounds, and it produces diverse text images and their corresponding labels. OCR researchers and developers use this to create large, varied datasets for improving their text recognition systems, especially when real-world data is scarce or expensive to annotate.

573 stars. No commits in the last 6 months. Available on PyPI.

Use this if you need to create a large, customized dataset of text images for training and benchmarking OCR models, especially for specific styles, languages, or challenging conditions.

Not ideal if you're looking for an off-the-shelf OCR solution, as this tool focuses solely on generating synthetic training data rather than performing text recognition itself.

OCR-training data-synthesis computer-vision machine-learning-datasets document-analysis
Stale 6m
Maintenance 0 / 25
Adoption 10 / 25
Maturity 25 / 25
Community 23 / 25
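
The four category scores listed above add up to the overall 58 / 100. The following is a minimal sketch of that arithmetic. It assumes the overall score is a plain sum of the category scores, which matches the numbers shown here but is not confirmed by the page; the variable names are illustrative.

```python
# Category scores as listed on this page; each category is scored out of 25.
subscores = {
    "Maintenance": 0,
    "Adoption": 10,
    "Maturity": 25,
    "Community": 23,
}
MAX_PER_CATEGORY = 25

# Assumption: the overall score is the unweighted sum of the categories.
total = sum(subscores.values())
maximum = MAX_PER_CATEGORY * len(subscores)

print(f"{total} / {maximum}")
```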

Stars: 573
Forks: 108
Language: Python
License: MIT
Last pushed: Jun 14, 2024
Commits (30d): 0
Dependencies: 13

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/clovaai/synthtiger"

Open to everyone: 100 requests/day with no key, or 1,000 requests/day with a free key.
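
The same request as the curl command above can be made from Python with the standard library. This is a hedged sketch: the URL layout is taken from the curl example, while the helper names (quality_url, fetch_quality), the default category, and the assumption that the endpoint returns a JSON body are illustrative and not documented on this page.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path copied from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(owner, repo, category="ml-frameworks"):
    """Build the endpoint URL for one repository (hypothetical helper)."""
    # Percent-encode each path segment so unusual names cannot break the URL.
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(owner, repo, category="ml-frameworks", timeout=10.0):
    """Fetch the record for one repository.

    Assumption: the response body is JSON; the page does not show its schema.
    """
    with urlopen(quality_url(owner, repo, category), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling fetch_quality("clovaai", "synthtiger") would issue the request shown in the curl example. Unauthenticated calls count against the 100 requests/day limit, so cache the result rather than polling.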