LLM Web Scraping (LLM Tools)

Tools for extracting and parsing structured data from websites using LLM-powered methods, including web crawlers, HTML extractors, and scraping APIs optimized for AI agent integration. Does NOT include general-purpose web scrapers without LLM integration, browser automation tools, or proxy/VPN services.

There are 31 LLM web scraping tools tracked. One scores above 50 (Established tier). The highest-rated is carlosplanchon/spidercreator at 51/100 with 217 stars.

Get all 31 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=llm-tools&subcategory=llm-web-scraping&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
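The same request can be made from a script. The following is a minimal sketch using only the Python standard library; the endpoint and query parameters are copied from the curl command above, while the helper names `build_url` and `fetch_dataset` are hypothetical. The shape of the JSON response is not documented here, so the sketch returns the decoded payload without assuming any field names. The sample request uses `limit=20`; whether the API accepts a higher limit is not stated on this page.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="llm-tools", subcategory="llm-web-scraping", limit=20):
    """Build the dataset URL with the same query parameters as the curl example."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_dataset(url, timeout=30):
    """Fetch the URL and decode the body as JSON.

    The response schema is an unknown here, so the decoded value is
    returned as-is rather than unpacked into named fields.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_dataset(build_url())` issues one unauthenticated request, which counts against the 100 requests/day allowance mentioned above.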

| # | Tool | Description | Score | Tier |
|---|------|-------------|-------|------|
| 1 | carlosplanchon/spidercreator | Automated web scraping spider generation using Browser Use and LLMs.... | 51 | Established |
| 2 | raznem/parsera | Lightweight library for scraping websites with LLMs | 48 | Emerging |
| 3 | rednafi/html-to-text | Extract pure text from any webpage | 44 | Emerging |
| 4 | supadata-ai/js | Official TypeScript/JavaScript SDK for the Supadata API. | 41 | Emerging |
| 5 | yeahhe365/JustSearch | Autonomous AI search agent built on Playwright. Supports iterative task planning, deep web crawling, and multi-source knowledge synthesis with cited sources. | 39 | Emerging |
| 6 | Riddhish1/CogniScrape | Intelligent Web Scraping Library with LLMs | 36 | Emerging |
| 7 | ElysiumOSS/enterprise-ai-recursive-web-scraper | AI assisted web scraper, w/ content summarization, screenshots, and filter 🤖🕷️ | 33 | Emerging |
| 8 | rowyio/LLM-Web-Crawler | Web Scraper and Crawler for LLM Apps and AI Workflows with NoCode / LowCode.... | 32 | Emerging |
| 9 | poodle64/supacrawl | Zero-infrastructure web scraping for the terminal | 32 | Emerging |
| 10 | cameronking4/nextjs-firecrawl-starter | Nextjs 15 Firecrawl app to scrape doc links for an LLM. Use it as a starter... | 32 | Emerging |
| 11 | sammcj/firecrawler | A lightweight frontend for self-hosted Firecrawl instances | 31 | Emerging |
| 12 | kubernetes-bad/metachar | Scraper for Chub.ai and JanitorAI.com | 30 | Emerging |
| 13 | firecrawl/firecrawl-py | Crawl and convert any website into clean markdown | 30 | Emerging |
| 14 | lee-lou2/distill | High-performance Rust-based web scraper and LLM analysis API server | 29 | Experimental |
| 15 | cipher-rc5/fire_ctrl | Spec-compliant self-hosted Firecrawl v2 runtime in native Rust | 28 | Experimental |
| 16 | flyrank-bih/flyscrape | The Most Powerful Open-source LLM Friendly Typescript Web Crawler & Scraper | 27 | Experimental |
| 17 | plater7/docrawl | Web crawler for documentation sites; converts pages to Markdown... | 25 | Experimental |
| 18 | AndreaBozzo/Ares | Next-gen AI scraper — LLM-powered structured data extraction | 24 | Experimental |
| 19 | us/crw | ⚡Lightweight Firecrawl alternative in Rust — 91.5% coverage, 5x faster, 3MB... | 24 | Experimental |
| 20 | TheFishPilot/Verity-Agentic-Web-Scraper | Verity API for verified web extraction in AI pipelines (Fastify +... | 22 | Experimental |
| 21 | ruchit-p/essence | A fast, open-source web retrieval engine built in Rust. | 21 | Experimental |
| 22 | greysquirr3l/stygian | High-performance graph-based web scraping engine + anti-detection browser... | 21 | Experimental |
| 23 | ChenTaHung/HTML-Text-Parser | This project is designed to extract text from documents and prepare it for... | 20 | Experimental |
| 24 | aglasencnik/Parsera.NET | A lightweight NuGet package for the Parsera API, designed to simplify... | 19 | Experimental |
| 25 | parsera-labs/parsera-ts | A Typesafe SDK for Scraping LLMs with Parsera.org and JavaScript | 18 | Experimental |
| 26 | davidyen1124/ai-crawler | AI web scraper using GPT to dynamically optimize CSS selectors for reliable... | 18 | Experimental |
| 27 | 1amageek/Scouter | A Swift library for recursive web content searching and link extraction... | 18 | Experimental |
| 28 | mlibre/Clean-Web-Scraper | A Node.js web scraper that extracts clean, readable content from websites -... | 17 | Experimental |
| 29 | Pankaj3112/pluckr | Schema-first, self-healing HTML extraction powered by LLMs | 17 | Experimental |
| 30 | jenslys/skrape-js | TypeScript/Node.js SDK to easily interact with the skrape.ai API | 11 | Experimental |
| 31 | riancintiyo/llm-review-scraper | Create LLM Summary from Google Maps | 10 | Experimental |
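
The tier labels in the listing appear to follow score cutoffs. The sketch below is an inference from this page only: the summary says Established means a score above 50, and in the listing a score of 30 is labelled Emerging while 29 is labelled Experimental. The function name `tier_for` and the exact boundary values are assumptions, not a documented scoring rule.

```python
from collections import Counter

# The 31 scores exactly as they appear in the listing, in rank order.
SCORES = [51, 48, 44, 41, 39, 36, 33, 32, 32, 32, 31, 30, 30,
          29, 28, 27, 25, 24, 24, 22, 21, 21, 20, 19, 18, 18, 18,
          17, 17, 11, 10]


def tier_for(score: int) -> str:
    """Map a score to a tier using cutoffs inferred from the listing (assumed)."""
    if score > 50:       # "above 50 (Established tier)" per the page summary
        return "Established"
    if score >= 30:      # 30 is Emerging and 29 is Experimental in the listing
        return "Emerging"
    return "Experimental"


# Tally how many tracked tools fall into each inferred tier.
tier_counts = Counter(tier_for(score) for score in SCORES)
print(dict(tier_counts))
# -> {'Established': 1, 'Emerging': 12, 'Experimental': 18}
```

Under these assumed cutoffs the tally matches the labels in the listing: 1 Established, 12 Emerging, and 18 Experimental tools.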