reader and just-scrape

These are competitors with similar core functionality: both provide LLM-optimized web scraping with markdown output and crawling capabilities. They differ in architecture. vakra-dev/reader is a self-hosted engine, while ScrapeGraphAI/just-scrape is a CLI client wrapper around the ScrapeGraph AI API, so they are alternative approaches to the same problem.

reader
Score: 55 (Established)
Maintenance: 10/25
Adoption: 10/25
Maturity: 22/25
Community: 13/25
Stars: 474
Forks: 32
Downloads:
Commits (30d): 0
Language: TypeScript
License: Apache-2.0
Risk flags: No risk flags

just-scrape
Score: 26 (Experimental)
Maintenance: 10/25
Adoption: 5/25
Maturity: 11/25
Community: 0/25
Stars: 10
Forks:
Downloads:
Commits (30d): 0
Language: TypeScript
License: MIT
Risk flags: No Package, No Dependents
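
The headline scores in the scorecard above are consistent with a plain sum of the four category scores, each out of 25. The page does not state the formula, so the sketch below is only an illustrative check under that assumption, not the site's actual scoring method; `total_score` is a hypothetical helper name.

```python
# Category scores copied from the scorecard; each category is out of 25.
SCORES = {
    "reader": {"Maintenance": 10, "Adoption": 10, "Maturity": 22, "Community": 13},
    "just-scrape": {"Maintenance": 10, "Adoption": 5, "Maturity": 11, "Community": 0},
}


def total_score(categories: dict) -> int:
    """Assumed rule: the headline score is the sum of the category scores."""
    return sum(categories.values())


totals = {name: total_score(cats) for name, cats in SCORES.items()}

for name, total in totals.items():
    print(f"{name}: {total}/100")
# prints:
# reader: 55/100
# just-scrape: 26/100
```

Under this reading, the 29-point gap comes mostly from Maturity (22 vs 11) and Community (13 vs 0); both projects score the same on Maintenance.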

About reader

vakra-dev/reader

Open-source, production-grade web scraping engine built for LLMs. Scrape and crawl the entire web, clean markdown, ready for your agents.

This project helps developers gather clean, structured web content for AI models and agents. It takes website URLs or entire domains as input, intelligently navigates complex sites, bypasses common anti-bot measures, and outputs cleaned content in markdown or HTML. It's designed for developers building applications that need reliable, large-scale web data.

AI development, web data collection, agent training data, content extraction, data pipeline

About just-scrape

ScrapeGraphAI/just-scrape

CLI for AI-powered web scraping, data extraction, search, and crawling powered by the ScrapeGraph AI API. Supports smart scraping, agentic browser automation, markdownify, sitemap discovery, and JSON mode for piping to AI agents.

This tool helps you quickly gather information from websites by intelligently extracting specific data or entire articles in a clean format. You provide a web address or a search query along with instructions for what to find, and it returns the desired information, optionally structured as data or plain text. It's designed for anyone who needs to collect data from the web for research, content analysis, or business intelligence.

web-data-extraction market-research content-analysis competitive-intelligence lead-generation

Scores updated daily from GitHub, PyPI, and npm data.