ispras/dedoc
Dedoc is a library (service) for automated document parsing and conversion to a uniform format. It automatically extracts content, logical structure, tables, and meta information from textual electronic documents. (Parse document; Document content extraction; Logical structure extraction; PDF parser; Scanned document parser; DOCX parser; HTML parser
Stars: 656
Forks: 51
Language: Python
License: Apache-2.0
Last pushed: Apr 07, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/document-ai/ispras/dedoc"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
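The same request can be sketched in Python using only the standard library. This is a minimal sketch under stated assumptions, not the service's documented client: the function names `build_quality_url` and `fetch_quality` are hypothetical, the `category/owner/repo` path layout is inferred from the single curl URL above, and the response is assumed to be JSON.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL for a category and an "owner/name" repository.

    Assumption: the path layout is <base>/<category>/<owner>/<name>,
    as suggested by the one URL shown in the listing.
    """
    owner, name = repo.split("/", 1)
    return f"{BASE_URL}/{quote(category)}/{quote(owner)}/{quote(name)}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Fetch the record for a repository.

    Assumption: the endpoint returns a JSON body; the schema is not
    described in the listing, so the parsed value is returned as-is.
    """
    request = urllib.request.Request(
        build_quality_url(category, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("document-ai", "ispras/dedoc")` would issue the same GET request as the curl command. Keyless access is limited to 100 requests/day, so repeated calls in a script may be worth caching.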
Related tools
opendatalab/MinerU
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
mehmet-kozan/pdf-parse
Pure TypeScript, cross-platform module for extracting text, images, and tabular data from PDFs....
HIllya51/LunaTranslator
Visual Novel Translator
ShareX/ShareX
ShareX is a free and open-source application that enables users to capture or record any area of...
btwld/docling-sdk
A TypeScript SDK for Docling - Bridge between the Python Docling ecosystem and JavaScript/TypeScript.