seagatesoft/webdext
Intelligent Web Data Extractor
This tool helps you quickly gather structured data from web pages that display lists, like product catalogs or search results. It takes a web page with multiple similar items and extracts key details for each, outputting them in an organized format. It's designed for anyone who needs to collect information efficiently from repetitive web page layouts, such as market researchers, data analysts, or competitive intelligence specialists.
No commits in the last 6 months.
Use this if you frequently need to extract consistent data records from web pages that show lists of items, like directory listings, job boards, or product search results.
Not ideal if you need to extract data from single, unstructured pages, or if you're not comfortable running a browser extension or injecting scripts.
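To make the input and output described above concrete, here is a toy sketch in Python. It is not webdext's algorithm, which this page does not describe: it simply treats each `<li>` element as one record and collects its text fragments. The sample HTML, `ListItemExtractor`, and `extract_records` are all hypothetical names introduced for illustration.

```python
from html.parser import HTMLParser

# Hypothetical list page: two similar items, each with a name and a price.
SAMPLE_HTML = """
<ul>
  <li><span class="name">Widget A</span> <span class="price">$10</span></li>
  <li><span class="name">Widget B</span> <span class="price">$12</span></li>
</ul>
"""


class ListItemExtractor(HTMLParser):
    """Collect the text fragments of each top-level <li> as one record."""

    def __init__(self):
        super().__init__()
        self.records = []      # one list of strings per <li>
        self._current = None   # record being built, or None outside <li>
        self._depth = 0        # nesting depth of <li> elements

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            if self._depth == 0:
                self._current = []
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "li" and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                self.records.append(self._current)
                self._current = None

    def handle_data(self, data):
        text = data.strip()
        # Keep only non-empty text that appears inside an <li>.
        if self._current is not None and text:
            self._current.append(text)


def extract_records(html: str) -> list:
    """Return one list of text fragments per list item in the HTML."""
    parser = ListItemExtractor()
    parser.feed(html)
    parser.close()
    return parser.records


records = extract_records(SAMPLE_HTML)
# records == [["Widget A", "$10"], ["Widget B", "$12"]]
```

A real extractor would have to discover the repeated structure itself instead of hard-coding `<li>`; this sketch only shows the shape of the result, one record per repeated item.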
Stars: 74
Forks: 16
Language: HTML
License: MIT
Last pushed: Dec 05, 2022
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/perception/seagatesoft/webdext"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
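The same request can be made from a script. The following is a minimal Python sketch using only the standard library. The base URL comes from the curl command above. The assumptions are that other repositories follow the same `owner/repo` path pattern and that the response is text: the response format and the way an API key is supplied are not stated here, so neither is modeled. `build_url` and `fetch_raw` are hypothetical helper names.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base URL taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/perception"


def build_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for an owner/repo pair (path pattern assumed)."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_raw(owner: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the response body as text without assuming its format."""
    with urlopen(build_url(owner, repo), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling `fetch_raw("seagatesoft", "webdext")` issues the same GET request as the curl command. With the keyless limit of 100 requests/day, it is worth caching the result locally instead of refetching it on every run.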
Higher-rated alternatives
scrapy/scrapy
Scrapy, a fast high-level web crawling & scraping framework for Python.
Altimis/Scweet
A simple and unlimited Twitter scraper: scrape tweets, likes, retweets, following, followers,...
lexiforest/curl_cffi
Python binding for curl-impersonate fork via cffi. An HTTP client that can impersonate browser...
plabayo/rama
Modular service framework to move and transform network packets.
scrapinghub/spidermon
Scrapy Extension for monitoring spiders execution.