sethupavan12/Markdownify

Convert documents and images to high-quality Markdown using Vision LLMs. Built for RAG ingestion pipelines.

/ 100

Emerging

This tool converts documents such as PDFs, images, and DOCX files into high-quality Markdown. It extracts text, images, tables, and charts while preserving the document's structure, producing clean, organized Markdown output. It suits anyone who needs to turn unstructured or visual document content into a machine-readable, editable format for tasks like building knowledge bases or preparing content for AI systems.

Use this if you need to convert scanned documents, reports, or images containing complex layouts into structured Markdown, including tables as Markdown tables and charts as Mermaid diagrams.

Not ideal if you only need plain text extraction without preserving document structure or converting visual elements to structured Markdown.
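
To make the target output concrete, here is a minimal sketch of what "tables as Markdown tables and charts as Mermaid diagrams" can look like. It is not Markdownify's API and does not call a Vision LLM; the function names `to_markdown_table` and `to_mermaid_pie` and the sample data are hypothetical, chosen only to illustrate the output formats.

```python
def to_markdown_table(header, rows):
    """Render a header and rows as a pipe-delimited Markdown table."""
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(cell) for cell in row) + " |")
    return "\n".join(lines)


def to_mermaid_pie(title, slices):
    """Render (label, value) pairs as a Mermaid pie chart definition."""
    lines = [f"pie title {title}"]
    for label, value in slices:
        lines.append(f'    "{label}" : {value}')
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical data standing in for content extracted from a document.
    print(to_markdown_table(["Quarter", "Revenue"], [["Q1", 120], ["Q2", 150]]))
    print(to_mermaid_pie("Revenue share", [("Q1", 120), ("Q2", 150)]))
```

Structured output like this keeps rows, columns, and chart values addressable by downstream tooling, which plain text extraction would flatten.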

document-conversion knowledge-base-creation content-extraction data-preparation information-management

No Package No Dependents

Maintenance 6 / 25

Adoption 6 / 25

Maturity 15 / 25

Community 4 / 25



Language Python

License Apache-2.0

Higher-rated alternatives

any4ai/AnyCrawl

AnyCrawl 🚀: A Node.js/TypeScript crawler that turns websites into LLM-ready data and extracts...

kreuzberg-dev/html-to-markdown

High performance and CommonMark compliant HTML to Markdown converter. Maintained by the...

ScrapeGraphAI/Scrapegraph-ai

Python scraper based on AI

adbar/trafilatura

Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping,...

paulpierre/markdown-crawler

A multithreaded 🕸️ web crawler that recursively crawls a website and creates a 🔽 markdown file...
