gustavovalverde/h2m-parser

Fast HTML-to-Markdown converter with Mozilla Readability extraction, a streaming renderer, and LLM-ready output. 4x faster than well-known alternatives.

Quality score: 39 / 100 (Emerging)

This tool helps content managers, researchers, and data analysts quickly extract the main article content from any webpage and convert it into clean, structured Markdown. It takes raw HTML from a web page and outputs a refined Markdown document, optionally including important metadata and content suitable for further analysis or integration with AI tools. The typical user needs to process web articles for various applications, like building knowledge bases or training language models.

No commits in the last 6 months. Available on npm.

Use this if you need a fast and reliable way to convert website articles into clean Markdown, especially for large volumes or for use with AI systems.

Not ideal if you only need to convert simple HTML snippets and have no use for article extraction or advanced post-processing.

content-extraction web-scraping knowledge-management data-preparation article-analysis
Stale 6m
Maintenance 2 / 25
Adoption 5 / 25
Maturity 24 / 25
Community 8 / 25

Stars: 9
Forks: 1
Language: TypeScript
License: MIT
Last pushed: Oct 06, 2025
Commits (30d): 0
Dependencies: 5

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/gustavovalverde/h2m-parser"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
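
The same request can be made from TypeScript. The following is a minimal sketch, assuming a runtime with a global fetch (for example Node 18 or later) and assuming the endpoint returns JSON. The helper names qualityUrl and fetchQuality are hypothetical, and the response schema is not documented on this page, so the body is left untyped.

```typescript
// Base URL and path layout are copied from the curl example above.
const API_BASE = "https://pt-edge.onrender.com/api/v1/quality";

// Hypothetical helper: builds the quality-score URL for one repository.
// Each path segment is URL-encoded so unusual names cannot break the path.
function qualityUrl(category: string, owner: string, repo: string): string {
  const parts = [category, owner, repo].map(encodeURIComponent);
  return `${API_BASE}/${parts.join("/")}`;
}

// Hypothetical helper: fetches the score record for one repository.
// Assumption: the response body is JSON. Its shape is unknown here,
// so callers must validate it before reading any fields.
async function fetchQuality(
  category: string,
  owner: string,
  repo: string,
): Promise<unknown> {
  const res = await fetch(qualityUrl(category, owner, repo));
  if (!res.ok) {
    // A non-2xx status may mean the daily request limit was reached.
    throw new Error(`Request failed: ${res.status} ${res.statusText}`);
  }
  return res.json();
}
```

Calling fetchQuality("llm-tools", "gustavovalverde", "h2m-parser") requests the same URL as the curl command. Returning unknown instead of a guessed interface keeps the sketch honest about the undocumented schema.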