get-set-fetch/scraper
Node.js web scraper with a command-line interface, Docker container, Terraform module, and Ansible roles for distributed cloud scraping. Supported databases: SQLite, MySQL, PostgreSQL. Supported headless clients: Puppeteer, Playwright, Cheerio, jsdom.
This tool automatically collects information from websites, whether product details, public data, or other content. You give it a starting URL and specify which pieces of information to extract, such as headlines, prices, or links; it then delivers a structured dataset, often in a format like CSV. It suits researchers, marketers, and data analysts who need to gather large amounts of publicly available web data.
113 stars. No commits in the last 6 months. Available on npm.
Use this if you need to systematically collect data from many web pages and store it in a structured format for analysis or further use.
Not ideal if you only need to extract data from a handful of pages or prefer a simple browser extension for occasional data grabs.
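The workflow described above (point the scraper at a start URL, name the fields to extract, get a structured CSV back) can be sketched in isolation. The record shape and `toCsv` helper below are hypothetical illustrations of the output step only, not part of the scraper's actual API:

```typescript
// Hypothetical shape of one scraped record: the fields you asked the
// scraper to extract from each page (names are illustrative).
type Row = { title: string; price: string; link: string };

// Example records as a scrape run might produce them.
const records: Row[] = [
  { title: "Widget A", price: "9.99", link: "https://example.com/a" },
  { title: 'Widget "B"', price: "19.99", link: "https://example.com/b" },
];

// Minimal CSV export: quote every field and double embedded quotes,
// which is the standard CSV escaping rule.
function toCsv(rows: Row[]): string {
  const esc = (v: string) => `"${v.replace(/"/g, '""')}"`;
  const header = ["title", "price", "link"].map(esc).join(",");
  const lines = rows.map(r => [r.title, r.price, r.link].map(esc).join(","));
  return [header, ...lines].join("\n");
}

console.log(toCsv(records));
```

The real tool handles crawling, queuing, and storage for you; this sketch only shows why a flat list of uniform records maps cleanly onto CSV for downstream analysis.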
Stars: 113
Forks: 18
Language: TypeScript
License: MIT
Category:
Last pushed: Mar 13, 2023
Commits (30d): 0
Dependencies: 11
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/perception/get-set-fetch/scraper"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Related tools
seleniumbase/SeleniumBase
APIs for browser automation, testing, and bypassing bot-detection.
apify/crawlee-python
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers....
intoli/user-agents
A JavaScript library for generating random user agents with data that's updated daily.
apify/crawlee
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In...
Kaliiiiiiiiii-Vinyzu/patchright
Undetected version of the Playwright testing and automation library.