nirholas/extract-llms-docs

Extract documentation for AI agents from any site with llms.txt support. Features MCP server, REST API, batch processing, and multiple export formats.

Score: 49 / 100 (Emerging)

This project helps AI agent builders and large language model (LLM) developers quickly get up-to-date, structured documentation from websites. Given the URL of any website that publishes an 'llms.txt' or 'install.md' file, it outputs organized, machine-readable documentation in formats like Markdown, JSON, or YAML. It is aimed at anyone building, training, or fine-tuning AI agents and LLMs who needs high-quality, current data.
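To make the input and output concrete, here is a minimal sketch of the general idea, not the extract-llms-docs API. It assumes the common llms.txt convention of a Markdown file served at the site root whose list items link to documentation pages. The names `DocLink`, `llmsTxtUrl`, and `parseLlmsTxt` are hypothetical and exist only for illustration.

```typescript
// Hypothetical helpers for illustration; this is NOT the extract-llms-docs API.
interface DocLink {
  title: string;
  url: string;
  note?: string;
}

// Assumption: llms.txt lives at the root of the site.
function llmsTxtUrl(siteUrl: string): string {
  return new URL("/llms.txt", siteUrl).toString();
}

// Collect Markdown list links of the form "- [title](url): optional note".
function parseLlmsTxt(text: string): DocLink[] {
  const linkLine = /^\s*[-*]\s*\[([^\]]+)\]\(([^)\s]+)\)(?::\s*(.*))?$/;
  const links: DocLink[] = [];
  for (const line of text.split(/\r?\n/)) {
    const m = linkLine.exec(line);
    if (!m) continue; // headings, blockquotes, and prose are skipped
    const [, title = "", url = "", note] = m;
    const link: DocLink = { title, url };
    if (note) link.note = note.trim();
    links.push(link);
  }
  return links;
}

// Usage: turn a small llms.txt body into JSON.
const example = [
  "# Example Project",
  "",
  "## Docs",
  "",
  "- [Quick start](https://example.com/quickstart.md): Setup steps",
  "- [API reference](https://example.com/api.md)",
].join("\n");

console.log(llmsTxtUrl("https://example.com/docs/intro"));
console.log(JSON.stringify(parseLlmsTxt(example), null, 2));
```

The parsing is kept as a pure function so it can be run on any text; fetching the file and exporting to YAML or Markdown are left out of this sketch.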

Available on npm.

Use this if you need to reliably extract, organize, and prepare website documentation for use with AI agents, LLMs, or automated documentation pipelines.

Not ideal if you want to extract general content from arbitrary websites, since it only targets sites that follow the 'llms.txt' and 'install.md' standards.

AI agent development, LLM training data, documentation automation, RAG systems, AI assistant content
Maintenance 10 / 25
Adoption 5 / 25
Maturity 20 / 25
Community 14 / 25


Stars: 14
Forks: 3
Language: TypeScript
License: MIT
Last pushed: Mar 03, 2026
Commits (30d): 0
Dependencies: 17

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/mcp/nirholas/extract-llms-docs"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
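The curl request above could be made from TypeScript along these lines. This is a sketch under stated assumptions: the `/{owner}/{repo}` path pattern is inferred from the single example URL, the response is treated as untyped JSON because its schema is not documented here, and the helper names `qualityUrl` and `fetchQuality` are hypothetical. It relies on the global `fetch` available in Node 18+ and modern browsers.

```typescript
// Base path taken from the example URL above.
const QUALITY_BASE = "https://pt-edge.onrender.com/api/v1/quality/mcp";

// Assumption: other repositories follow the same /{owner}/{repo} pattern.
function qualityUrl(owner: string, repo: string): string {
  return `${QUALITY_BASE}/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}`;
}

// Fetch the quality record without a key. The response shape is not
// documented here, so it is returned as `unknown` for the caller to inspect.
async function fetchQuality(owner: string, repo: string): Promise<unknown> {
  const res = await fetch(qualityUrl(owner, repo));
  if (!res.ok) {
    throw new Error(`Request failed with HTTP ${res.status}`);
  }
  return res.json();
}

// Usage (performs a network request, so it is left commented out):
// fetchQuality("nirholas", "extract-llms-docs").then((data) => console.log(data));
```

How a key is supplied is not stated on this page, so the sketch only covers keyless requests.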