danielnieto/scrapman
Retrieve real HTML (with JavaScript executed) from a URL, ultra fast, with support for loading multiple pages in parallel
This tool helps developers efficiently gather website content, even from pages that load dynamic information using JavaScript. You provide a list of URLs, and it returns the complete, rendered HTML content for each, just as a web browser would see it. It's designed for developers building applications that need to process or analyze web page content.
No commits in the last 6 months. Available on npm.
Use this if you need to programmatically fetch the fully-rendered HTML from many web pages quickly, especially those that rely heavily on JavaScript to display their content.
Not ideal if you need to interact with a web page like a user (e.g., clicking buttons, filling forms) beyond just retrieving its HTML, as this is not a browser automation tool.
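The workflow described above (a list of URLs in, rendered HTML out, several pages loading at once) can be illustrated with a minimal sketch. This is not scrapman's own API: `loadAll`, `renderPage`, and `maxConcurrent` are hypothetical names chosen for illustration, and `renderPage` stands in for whatever component actually renders a page with JavaScript executed.

```javascript
// Hypothetical helper, NOT scrapman's API. `renderPage` is an assumed async
// function (url) => Promise<string> that resolves to the rendered HTML.
async function loadAll(urls, renderPage, maxConcurrent = 4) {
  const results = new Array(urls.length);
  let next = 0;

  // Each worker claims the next unprocessed URL until none remain,
  // so at most `maxConcurrent` pages are loading at any moment.
  async function worker() {
    while (next < urls.length) {
      const index = next++;
      const url = urls[index];
      try {
        results[index] = { url, html: await renderPage(url) };
      } catch (error) {
        // Record the failure instead of aborting the whole batch.
        results[index] = { url, error: String(error) };
      }
    }
  }

  const workerCount = Math.max(1, Math.min(maxConcurrent, urls.length));
  await Promise.all(Array.from({ length: workerCount }, worker));
  return results; // same order as the input URLs
}

// Stand-in renderer for demonstration only; a real one would drive a browser engine.
const fakeRender = async (url) => `<html><body>${url}</body></html>`;

loadAll(["https://example.com/a", "https://example.com/b"], fakeRender, 2)
  .then((pages) => pages.forEach((page) => console.log(page.url, page.html.length)));
```

Capping concurrency keeps memory use bounded when the URL list is long, and returning results in input order makes it easy to match each page back to its URL.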
Stars: 22
Forks: 3
Language: JavaScript
License: MIT
Category:
Last pushed: Apr 13, 2018
Commits (30d): 0
Dependencies: 3
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/perception/danielnieto/scrapman"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
scrapy/scrapy
Scrapy, a fast high-level web crawling & scraping framework for Python.
Altimis/Scweet
A simple and unlimited twitter scraper : scrape tweets, likes, retweets, following, followers,...
lexiforest/curl_cffi
Python binding for curl-impersonate fork via cffi. A http client that can impersonate browser...
plabayo/rama
modular service framework to move and transform network packets
scrapinghub/spidermon
Scrapy Extension for monitoring spiders execution.