kreuzberg-dev/html-to-markdown
A high-performance, CommonMark-compliant HTML to Markdown converter, maintained by the Kreuzberg team. Kreuzberg is a fast, polyglot document intelligence engine with a Rust core that extracts structured data from 56+ document formats using streaming parsers and built-in OCR.
This tool helps developers transform web page content or other HTML snippets into clean, readable Markdown format. You provide it with raw HTML, and it outputs well-structured Markdown, along with extracted metadata like titles, links, and tables. It's designed for developers building applications that process or display web content, ensuring consistent conversion across various programming languages.
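To make the HTML-in, Markdown-plus-metadata-out flow concrete, here is a minimal sketch built only on Python's standard-library html.parser. It is not html-to-markdown's API or implementation. The names MiniMarkdownConverter and convert are hypothetical, and the sketch handles only headings, paragraphs, links, and bold/italic text.

```python
from html.parser import HTMLParser

HEADINGS = ("h1", "h2", "h3", "h4", "h5", "h6")


class MiniMarkdownConverter(HTMLParser):
    """Toy converter (hypothetical, not the html-to-markdown library)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []        # Markdown fragments, joined at the end
        self.links = []        # collected (text, href) pairs
        self.title = None      # text of <title>, if present
        self._href = None      # href of the currently open <a>, else None
        self._link_text = []   # text seen inside the open <a>
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in HEADINGS:
            # "h2" -> "## ", preceded by a blank line
            self.parts.append("\n\n" + "#" * int(tag[1]) + " ")
        elif tag == "p":
            self.parts.append("\n\n")
        elif tag in ("strong", "b"):
            self.parts.append("**")
        elif tag in ("em", "i"):
            self.parts.append("*")
        elif tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._link_text = []

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("strong", "b"):
            self.parts.append("**")
        elif tag in ("em", "i"):
            self.parts.append("*")
        elif tag == "a" and self._href is not None:
            text = "".join(self._link_text)
            self.links.append((text, self._href))
            self.parts.append(f"[{text}]({self._href})")
            self._href = None

    def handle_data(self, data):
        if self._in_title:
            self.title = data.strip()
        elif self._href is not None:
            self._link_text.append(data)
        else:
            self.parts.append(data)


def convert(html):
    """Return (markdown, metadata) for a small HTML snippet."""
    parser = MiniMarkdownConverter()
    parser.feed(html)
    parser.close()
    markdown = "".join(parser.parts).strip()
    return markdown, {"title": parser.title, "links": parser.links}


if __name__ == "__main__":
    md, meta = convert("<h1>Hello</h1><p>Plain <em>text</em>.</p>")
    print(md)
    print(meta)
```

A production converter also has to deal with nested lists, tables, code blocks, escaping, and malformed markup, which is the work a dedicated library takes on.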
565 stars. Actively maintained with 158 commits in the last 30 days.
Use this if you need to reliably convert HTML content into Markdown for storage, display, or further processing within a software application, especially across different programming languages.
Not ideal if you are an end-user needing a simple drag-and-drop tool for occasional personal HTML to Markdown conversion without programming.
Stars: 565
Forks: 50
Language: HTML
License: MIT
Last pushed: Mar 13, 2026
Commits (30d): 158
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/kreuzberg-dev/html-to-markdown"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
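The same request can be made from Python with the standard library. This is a sketch under assumptions: the helper names build_quality_url and fetch_quality are hypothetical, and because the response format is not documented here, the body is returned as raw text rather than parsed.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example; the owner and repo are appended to it.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/rag"


def build_quality_url(owner, repo):
    """Build the endpoint URL for one repository (hypothetical helper)."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner, repo, timeout=10.0):
    """Fetch the raw response body as text; no response schema is assumed."""
    with urlopen(build_quality_url(owner, repo), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Performs a real network request and counts against the daily limit.
    print(fetch_quality("kreuzberg-dev", "html-to-markdown"))
```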
Related tools
any4ai/AnyCrawl
AnyCrawl: A Node.js/TypeScript crawler that turns websites into LLM-ready data and extracts...
ScrapeGraphAI/Scrapegraph-ai
Python scraper based on AI
adbar/trafilatura
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping,...
paulpierre/markdown-crawler
A multithreaded web crawler that recursively crawls a website and creates a markdown file...
lightfeed/extractor
Using LLMs and AI browser automation to robustly extract web data