RQLuo/MixTeX-DataHub
LaTeXDataHub is an open-source platform for sharing real-world LaTeX image datasets and their annotations. Users can upload, download, and contribute to a growing collection of high-quality LaTeX datasets.
LaTeXDataHub helps researchers, educators, and data scientists gather and share high-quality LaTeX images and their corresponding text annotations. You can upload various types of LaTeX content, such as modern printed documents, handwritten notes, or even blackboard lectures, and in return, you get access to a growing collection of diverse LaTeX datasets. This platform is for anyone working with LaTeX who needs to build or improve tools for processing scientific documents, especially those involving optical character recognition (OCR).
No commits in the last 6 months.
Use this if you need to find specialized datasets of LaTeX images and their text representations for training AI models, or if you want to contribute your own LaTeX image data to a collaborative open-source project.
Not ideal if you are looking for a tool to directly convert your LaTeX images into text without needing to manage or contribute to datasets.
Stars: 12
Forks: —
Language: Jupyter Notebook
License: MIT
Category:
Last pushed: Aug 13, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/RQLuo/MixTeX-DataHub"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
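The same request can be made from a script. The following is a minimal Python sketch, not an official client: the function names `build_quality_url` and `fetch_quality` are hypothetical, the `ml-frameworks` path segment is copied from the curl URL above, and the assumption that the endpoint returns JSON is not confirmed by this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL for a repository such as 'RQLuo/MixTeX-DataHub'."""
    # Keep the slash between owner and repository name unescaped.
    return f"{BASE_URL}/{quote(category)}/{quote(repo, safe='/')}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Fetch the record for a repository.

    Assumption: the response body is JSON. The schema is not documented
    here, so the parsed value is returned as-is.
    """
    url = build_quality_url(category, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the URL; call fetch_quality(...) to make a network request.
    print(build_quality_url("ml-frameworks", "RQLuo/MixTeX-DataHub"))
```

Calling `fetch_quality("ml-frameworks", "RQLuo/MixTeX-DataHub")` issues one request, which counts against the daily request limit stated above.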
Higher-rated alternatives
ogkalu2/comic-translate
Desktop app for automatically translating comics - BDs, Manga, Manhwa, Fumetti and more in a...
naptha/tesseract.js
Pure Javascript OCR for more than 100 Languages 📖🎉🖥
mayocream/koharu
ML-powered manga translator, written in Rust.
tesseract-ocr/tesseract
Tesseract Open Source OCR Engine (main repository)
zyddnys/manga-image-translator
Translate manga/image: one-click translation of text in all kinds of images https://cotrans.touhou.ai/ (no longer working)