hashangit/Extract2MD
Extract2MD is an AI-enabled, client-side JavaScript library that extracts text from PDF files and converts it to Markdown.
Need to turn PDFs into clean, structured Markdown for your notes, documentation, or content? This tool takes your PDF files, whether they have selectable text or are scanned images, and converts them into well-formatted Markdown. It’s ideal for technical writers, researchers, or content creators who frequently work with documents and need to extract their content into a flexible, plain-text format for editing or publishing.
105 stars. Available on npm.
Use this if you need to reliably convert a wide variety of PDF documents, including those with complex layouts or scanned content, into structured Markdown, optionally enhanced by AI for better readability and organization.
Not ideal if you need a desktop application or a server-side solution, as this is designed for client-side web browser environments.
Stars: 105
Forks: 6
Language: JavaScript
License: MIT
Category:
Last pushed: Feb 07, 2026
Commits (30d): 0
Dependencies: 3
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/hashangit/Extract2MD"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
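The same request can be made from JavaScript. The sketch below is only an illustration: the helper names `buildQualityUrl` and `fetchQuality` are hypothetical, and it assumes the endpoint returns JSON, since the response format is not described here. It relies on the global `fetch` available in browsers and in Node.js 18 or later.

```javascript
// Base endpoint copied from the curl command above.
const API_BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools";

// Hypothetical helper: build the request URL for a GitHub owner/repo pair.
function buildQualityUrl(owner, repo) {
  return `${API_BASE}/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}`;
}

// Hypothetical helper: fetch the record and return the parsed body.
// Assumption: the endpoint responds with JSON. Its field names are not
// documented here, so the result is returned without interpretation.
async function fetchQuality(owner, repo) {
  const response = await fetch(buildQualityUrl(owner, repo));
  if (!response.ok) {
    throw new Error(`Request failed with HTTP ${response.status}`);
  }
  return response.json();
}
```

Calling `fetchQuality("hashangit", "Extract2MD")` returns a promise for the parsed response. Keeping URL construction in its own function makes it easy to check the request target without touching the network.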
Related tools
NanoNets/docstrange
Extract and convert data from any document, images, pdfs, word doc, ppt or URL into multiple...
th1nhhdk/local_ai_ocr
A local, offline (after initial setup), portable OCR software that can process images and PDF...
Dicklesworthstone/llm_aided_ocr
Enhances Tesseract OCR output using LLMs (local or API) for error correction, smart chunking,...
emcf/thepipe
Get clean data from tricky documents, powered by vision-language models ⚡
langstruct-ai/langstruct
Extract structured data from any content using LLMs.