vakra-dev/reader

Open-source, production-grade web scraping engine built for LLMs. Scrape and crawl the entire web, clean markdown, ready for your agents.


Established

This project helps developers gather clean, structured web content for AI models and agents. It takes website URLs or entire domains as input, intelligently navigates complex sites, bypasses common anti-bot measures, and outputs cleaned content in markdown or HTML. It's designed for developers building applications that need reliable, large-scale web data.

474 stars. Available on npm.

Use this if you are a developer building AI agents or applications that need to consistently and reliably extract clean text or HTML from many websites, even those with anti-bot protections.

Not ideal if you need a simple, one-off web scraper for personal use or if you are not comfortable working with a command-line interface or programming API.

AI development, web data collection, agent training data, content extraction, data pipeline

Maintenance 10 / 25

Adoption 10 / 25

Maturity 22 / 25

Community 13 / 25


Stars: 474

Forks:

Language: TypeScript

License: Apache-2.0

Compare

reader and teracrawl

reader and scraping-agent-ai

reader and just-scrape

Related agents

joaobenedetmachado/scrapit

A (really) easy way to web scrape

firecrawl/open-scouts

🔥 AI-powered web monitoring platform. Create automated scouts that search the web and send email...

BrowserCash/teracrawl

High-performance web crawler API optimized for LLMs. Turn any search or website into clean...

memvid/maw

Crawl any website into a single searchable file. Query it forever, offline.

poneoneo/Alibaba-CLI-Scraper

Create your own Alibaba dataset and interact with it in plain English.
