Keyvanhardani/german-ocr
German-OCR is specifically trained to extract text from German documents including invoices, receipts, forms, and other business documents.
German-OCR helps businesses and individuals pull key information out of German-language documents such as invoices, receipts, and forms. You provide a German document (PDF or image) and get back the extracted text in formats including plain text, Markdown, and structured JSON. It is aimed at anyone who regularly processes German documents and needs to digitize or categorize their content quickly.
Used by 1 other package. Available on PyPI.
Use this if you need to rapidly process and extract data from German business documents, whether you prefer cloud-based services or local, privacy-focused processing.
Not ideal if your documents are in languages other than German, or if you need OCR exclusively for handwritten text.
Stars: 75
Forks: 5
Language: Python
License: Apache-2.0
Category:
Last pushed: Feb 03, 2026
Commits (30d): 0
Dependencies: 3
Reverse dependents: 1
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/Keyvanhardani/german-ocr"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
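The same request can be made from Python. The sketch below is a minimal illustration using only the standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical. It assumes the endpoint returns JSON, and it reads the `nlp` path segment as a category slug; neither is confirmed on this page, and the response fields are not documented here.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Assemble the endpoint URL in the same shape as the curl example.

    Assumption: the path is <category>/<owner>/<repo>, inferred only from
    the single URL shown on this page.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record and decode it.

    Assumption: the response body is JSON. Its schema is not documented
    here, so the decoded object is returned without further interpretation.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_quality("nlp", "Keyvanhardani", "german-ocr")` issues the same GET request as the curl command. Without a key, the 100 requests/day limit applies, so cache results rather than polling.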
Related tools
google/langextract
A Python library for extracting structured information from unstructured text using LLMs with...
Extralit/extralit
Fast and accurate systemic data extraction with LLM assistance
oidlabs-com/Lexoid
Multimodal document parser for high quality data understanding and extraction
xingbow/SciDaEx
Structured data extraction from research literature
parsee-ai/parsee-core
Retrieval of fully structured data made easy. Use LLMs or custom models. Specialized on PDFs and...