vakra-dev/reader

Open-source, production-grade web scraping engine built for LLMs. Scrape and crawl the entire web and get clean markdown, ready for your agents.

Score: 55 / 100 (Established)

This project helps developers gather clean, structured web content for AI models and agents. It takes website URLs or entire domains as input, intelligently navigates complex sites, bypasses common anti-bot measures, and outputs cleaned content in markdown or HTML. It's designed for developers building applications that need reliable, large-scale web data.

474 stars. Available on npm.

Use this if you are a developer building AI agents or applications that need to consistently and reliably extract clean text or HTML from many websites, even those with anti-bot protections.

Not ideal if you need a simple, one-off web scraper for personal use or if you are not comfortable working with a command-line interface or programming API.

AI development, web data collection, agent training data, content extraction, data pipeline
Maintenance 10 / 25
Adoption 10 / 25
Maturity 22 / 25
Community 13 / 25

Stars: 474
Forks: 32
Language: TypeScript
License: Apache-2.0
Last pushed: Feb 02, 2026
Commits (30d): 0
Dependencies: 9

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/agents/vakra-dev/reader"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
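
The same request can be made from code. Below is a minimal TypeScript sketch, assuming a runtime with a global `fetch` (for example Node 18+). The helper names `qualityUrl` and `fetchQuality` are hypothetical, and reading `agents` in the path as a category segment is an assumption based only on the curl URL above. The response schema is not documented here, so the result is left as `unknown` rather than given a guessed type.

```typescript
// Base URL copied from the curl example above.
const API_BASE = "https://pt-edge.onrender.com/api/v1/quality";

// Build the endpoint URL for a repository.
// Assumption: "agents" is a category segment; it is kept as a parameter
// because only this one URL is known.
function qualityUrl(owner: string, repo: string, category: string = "agents"): string {
  const parts = [category, owner, repo].map((p) => encodeURIComponent(p));
  return `${API_BASE}/${parts.join("/")}`;
}

// Fetch the quality data. The JSON shape is not specified in this listing,
// so the caller receives `unknown` and must inspect it before use.
async function fetchQuality(owner: string, repo: string): Promise<unknown> {
  const res = await fetch(qualityUrl(owner, repo));
  if (!res.ok) {
    // Any non-2xx status (including hitting the daily request limit)
    // surfaces here as an error.
    throw new Error(`Request failed: ${res.status} ${res.statusText}`);
  }
  return res.json();
}

// Usage (performs a network request, so it is left commented out):
// fetchQuality("vakra-dev", "reader").then((data) => console.log(data));
```

Keeping the URL builder separate from the network call makes it easy to check the path without spending any of the daily request quota.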