Intelligent Web Data Extraction AI Agents

Tools that use AI agents to automatically extract, parse, and structure data from websites through natural language instructions and intent-based scraping. Does NOT include general web crawlers, SEO audit platforms, lead database services, or non-agentic scraping libraries.

49 intelligent web data extraction agents are tracked. Two score 50 or higher (Established tier). The highest-rated is vakra-dev/reader at 55/100 with 474 stars.

Get all 49 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=agents&subcategory=intelligent-web-data-extraction&limit=20"

Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
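The same request can be made from a script. The following is a minimal Python sketch that rebuilds the URL from the curl command above and decodes the response as JSON. The function names `build_url` and `fetch_dataset` are hypothetical helpers, not part of the API. The response schema is not documented here, so the sketch returns whatever JSON the server sends. Whether the `limit` parameter accepts values above 20 (for example 49, to cover every project) is an assumption to verify against the API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example; everything else here is illustrative.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(limit=20):
    """Build the dataset URL with the query parameters from the curl example."""
    params = {
        "domain": "agents",
        "subcategory": "intelligent-web-data-extraction",
        "limit": limit,  # assumption: larger values may or may not be accepted
    }
    return f"{BASE_URL}?{urlencode(params)}"


def fetch_dataset(limit=20, timeout=10):
    """Fetch the dataset and decode it as JSON (schema not assumed)."""
    with urlopen(build_url(limit), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Printing the URL keeps this entry point free of network access.
    print(build_url())
```

Calling `fetch_dataset()` performs the actual request and counts against the daily limit of the keyless tier.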

| # | Agent | Description | Score | Tier |
|---|-------|-------------|-------|------|
| 1 | vakra-dev/reader | Open-source, production-grade web scraping engine built for LLMs. Scrape and... | 55 | Established |
| 2 | joaobenedetmachado/scrapit | A (really) easy way to web scrape | 50 | Established |
| 3 | firecrawl/open-scouts | 🔥 AI-powered web monitoring platform. Create automated scouts that search... | 47 | Emerging |
| 4 | BrowserCash/teracrawl | High-performance web crawler API optimized for LLMs. Turn any search or... | 45 | Emerging |
| 5 | memvid/maw | Crawl any website into a single searchable file. Query it forever, offline. | 41 | Emerging |
| 6 | poneoneo/Alibaba-CLI-Scraper | Create your own Alibaba dataset and interact with it in plain English. | 40 | Emerging |
| 7 | jufeng-2022/mtywatch | Monitor web page content changes with a single sentence. AI, crawler, web page monitoring, page update alerts, web content subscription | 38 | Emerging |
| 8 | hmshb/scraping-agent-ai | AI-powered web scraping agent built with LangGraph, LangSmith, Firecrawl,... | 37 | Emerging |
| 9 | ma-pony/deepspider | Intelligent crawler engineering platform: an AI crawler agent based on DeepAgents + Patchright / Intelligent Web... | 35 | Emerging |
| 10 | kaymen99/ai-web-scraper | AI web scraper built with Crawl4AI for extracting structured leads data from... | 31 | Emerging |
| 11 | spider-rs/web-crawling-guides | How-to guides on web crawling or scraping | 31 | Emerging |
| 12 | Dieans/Universal-News-Scraper | 🌍 Scrape and aggregate news effortlessly with Universal News Scraper, your... | 30 | Emerging |
| 13 | tinaponting/ai-robots-scrapers | AI robots.txt, AI scrapers block ai scrapers | 30 | Emerging |
| 14 | ScrapeGraphAI/ScrapeHubAI | 🌟 AI-powered tool to analyze GitHub stargazers, identify companies, and... | 29 | Experimental |
| 15 | NickEinstein1/Scrapper-Enricher | Scraping Agent - CrewAI | 27 | Experimental |
| 16 | oxylabs/ai-crawler-py | Crawl a website starting from a URL, find relevant pages, and extract data –... | 27 | Experimental |
| 17 | ScrapeGraphAI/just-scrape | CLI for AI-powered web scraping, data extraction, search, and crawling ... | 26 | Experimental |
| 18 | 1nn0k3sh4/trendevourer | Trend Devourer 👗✨ AI-Powered Visual Style Analyst | 26 | Experimental |
| 19 | Chaitya44/AI-WebScraper | An intelligent, universal web scraper powered by Google Gemini AI. Features... | 24 | Experimental |
| 20 | phia-francis/nesta-signal-scout | An AI-powered foresight agent for Nesta's Discovery Hub. Signal Scout... | 24 | Experimental |
| 21 | brightdata/trendscan | TrendScan is a multi-source company intelligence platform for automated... | 24 | Experimental |
| 22 | isweerasingha/Auditeo-AI | An enterprise-grade, agentic website audit engine powered by GPT-5.4 and... | 23 | Experimental |
| 23 | Musubi-ai/Musubi | Musubi: A convenient crawling tool for collecting web text data in Python. | 23 | Experimental |
| 24 | Kaus-code/Neuroscout-oss | An autonomous AI agent powered by Gemini 2.5 Flash that scouts GitHub for... | 22 | Experimental |
| 25 | rbhatia1997/artist-scout | Open-source AI A&R toolkit for artist scouting, shortlist building, and... | 22 | Experimental |
| 26 | lout33/scout-oss | Local web research agent and mission-driven intelligence scanner that writes... | 22 | Experimental |
| 27 | breezy89757/AgentScraper | AgentScraper: AI-Powered Web Scraper (v1.0) with Visual Extraction | 21 | Experimental |
| 28 | oxylabs/ai-scraper-py | AI Scraper is a powerful scraping tool and scrape agent built to automate... | 21 | Experimental |
| 29 | sirToby99/swipenode | Lightning-fast, zero-render web extraction CLI built for AI agents. Extracts... | 21 | Experimental |
| 30 | breezy89757/SmartScraper | 🤖 AI-Powered Web Scraper Generator - Turns URLs into Python code with... | 20 | Experimental |
| 31 | musadiq7860/AI_growth_auditor | AI-powered business growth audit tool - scrapes website, generates custom... | 17 | Experimental |
| 32 | FlowExtractAPI/ai-lead-extractor | Extract any information from websites using intelligent AI - from contact... | 17 | Experimental |
| 33 | rosasbehoundja/tech-trends-monitor | Automated RSS feed monitoring system | 17 | Experimental |
| 34 | Tomefy5/scout-agent | Autonomous AI Agent for B2B Lead Generation & Enrichment | 16 | Experimental |
| 35 | smoothemerson/scout | AI-powered multi-agent system that analyzes your GitHub profile and CV to... | 14 | Experimental |
| 36 | Ascentia-Sandbox/StartInsight | Daily automated startup intelligence: 6 scrapers (Reddit/HN/PH/Trends/X) → 8... | 14 | Experimental |
| 37 | stell619/scraper-agent | AI-powered research agent - scrapes YouTube, Etsy, crypto, stocks & trends... | 14 | Experimental |
| 38 | afrexai-cto/ai-ops-audit | Free AI operations audit checklist for mid-market companies. Score your... | 13 | Experimental |
| 39 | BraaMohammed/microwave-ai | Microwave AI is a chat-based AI agent for vibe data enrichment. Upload a... | 13 | Experimental |
| 40 | nomiS0614/mtywatch | 📧 Monitor webpage content with AI and receive real-time updates on topics... | 13 | Experimental |
| 41 | areebahmeddd/crawl4ai-agent | Web Crawler Agent for Indexing Technical Documentation with Vector Search | 13 | Experimental |
| 42 | brightdata/brand-reputation-monitor | AI-powered brand monitoring workflow using Bright Data SDK, OpenAI &... | 13 | Experimental |
| 43 | vinay-852/AI-Agent-for-Sheets | The primary objective of this project is to harness Google’s Generative AI... | 13 | Experimental |
| 44 | Atqiyanabila01/AI-Lead-Scout | An AI-powered web research agent that crawls company data and generates... | 13 | Experimental |
| 45 | Hirsun/Website-Crawler | An HTML web page crawling service designed for AI agents that efficiently fetches web page content and cleans it. | 13 | Experimental |
| 46 | itallstartedwithaidea/google-ai-agent-audit-engine | AI-powered Google Ads audit engine - automated account analysis, scoring,... | 12 | Experimental |
| 47 | brunosergi/ai-scraping-kit | Complete self-hosted stack for building AI-powered web scraping automation... | 11 | Experimental |
| 48 | terencicp/ai-web-scraper | Prototype for a semi-autonomous web scraper that extracts lists of objects. | 11 | Experimental |
| 49 | dotabdullah/AuditIQ-SEO-Audit-Tool | We’ve built a smart automation system that analyzes your website’s SEO... | 10 | Experimental |
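
The tier labels in the listing above appear to follow score bands. The sketch below maps a score out of 100 to a tier using cut-offs inferred from the listed rows: 50 is the lowest Established score, 30 is the lowest Emerging score, and 29 is the highest Experimental score. The exact thresholds are an assumption, since the listing does not state them, and `tier_for_score` is a hypothetical helper name.

```python
def tier_for_score(score, established_min=50, emerging_min=30):
    """Map a 0-100 quality score to a tier label.

    The default cut-offs are inferred from the listing, not documented:
    50 -> Established, 47 and 30 -> Emerging, 29 -> Experimental.
    """
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score >= established_min:
        return "Established"
    if score >= emerging_min:
        return "Emerging"
    return "Experimental"


if __name__ == "__main__":
    # Scores taken from rows 1, 3, 13, and 14 of the listing.
    for score in (55, 47, 30, 29):
        print(score, tier_for_score(score))
```

No score between 48 and 49 appears in the listing, so the true Established cut-off could sit anywhere in that gap; the sketch keeps both thresholds as parameters for that reason.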