emadehsan/thal
Getting started with Puppeteer and Chrome Headless for Web Scraping
This project offers a starting guide for anyone looking to programmatically extract information from websites using automated browser actions. It shows how to use a headless Chrome browser to navigate web pages, log into accounts, fill out forms, and pull specific data like user emails. This is for market researchers, data analysts, or anyone who needs to collect public web data efficiently.
2,365 stars. No commits in the last 6 months.
Use this if you need to automate interactions with websites to collect data that's not easily available through public APIs, such as scraping specific details from user profiles after logging in.
Not ideal if you are looking for a simple, no-code web scraping solution or if your primary goal is basic data extraction from static pages without needing to simulate user behavior.
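The workflow described above (open a headless browser, log in, navigate, pull emails) can be sketched roughly as follows. This is a minimal illustration, not code taken from the thal repository: the function names `extractEmails` and `scrapeEmails`, the URLs, the credentials, and the CSS selectors are all hypothetical placeholders, and the sketch assumes the `puppeteer` npm package is installed. Only scrape data you are permitted to collect.

```javascript
// Pure helper: pull unique, lowercased email-like strings out of page text.
// The regex is a pragmatic approximation, not a full RFC 5322 validator.
function extractEmails(text) {
  const pattern = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
  const matches = text.match(pattern) || [];
  return [...new Set(matches.map((m) => m.toLowerCase()))];
}

// Log in through a form, open a target page, and return the emails found on it.
// All option values (URLs, selectors, credentials) are supplied by the caller
// and are assumptions about the site being scraped.
async function scrapeEmails({ loginUrl, targetUrl, username, password, selectors }) {
  // Loaded lazily so the pure helper above works without puppeteer installed.
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch({ headless: true });
  try {
    const page = await browser.newPage();

    // Fill out and submit the login form.
    await page.goto(loginUrl, { waitUntil: 'networkidle2' });
    await page.type(selectors.username, username);
    await page.type(selectors.password, password);
    await Promise.all([
      page.waitForNavigation({ waitUntil: 'networkidle2' }),
      page.click(selectors.submit),
    ]);

    // Visit the page of interest and read its visible text.
    await page.goto(targetUrl, { waitUntil: 'networkidle2' });
    const text = await page.evaluate(() => document.body.innerText);
    return extractEmails(text);
  } finally {
    // Always release the browser process, even if a step above throws.
    await browser.close();
  }
}
```

A call might look like `scrapeEmails({ loginUrl: 'https://example.com/login', targetUrl: 'https://example.com/members', username: 'me', password: 'secret', selectors: { username: '#user', password: '#pass', submit: 'button[type=submit]' } })`, with every value replaced by ones matching the real site. Keeping the email extraction separate from the browser steps makes that part easy to check without launching Chrome.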
Stars: 2,365
Forks: 206
Language: JavaScript
License: MIT
Category: (none listed)
Last pushed: Oct 28, 2020
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/perception/emadehsan/thal"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
seleniumbase/SeleniumBase
APIs for browser automation, testing, and bypassing bot-detection.
apify/crawlee-python
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers....
intoli/user-agents
A JavaScript library for generating random user agents with data that's updated daily.
apify/crawlee
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In...
Kaliiiiiiiiii-Vinyzu/patchright
Undetected version of the Playwright testing and automation library.