MohamedHmini/iww
AI-based web wrapper for web content extraction
This project helps data professionals and researchers automatically extract specific content from websites. You provide a web page URL, and it identifies and extracts structured data, such as product lists or the main article text, and writes the result to a JSON file. This is useful for anyone who needs to gather information from many web pages for analysis without manual copy-pasting or complex coding.
102 stars. No commits in the last 6 months.
Use this if you need to reliably extract specific types of content, such as product listings or the primary article body, from various web pages for data collection or research.
Not ideal if you need a simple, visual point-and-click tool for occasional web scraping, as this requires some programming knowledge.
Stars: 102
Forks: 14
Language: Python
License: MIT
Category:
Last pushed: Feb 06, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/perception/MohamedHmini/iww"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
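The curl request above can also be issued from Python. The following is a minimal sketch using only the standard library. The helper names `build_url` and `fetch_quality` are hypothetical, and the code assumes the endpoint returns a JSON body; the response fields are not documented here, so the sketch parses the body without relying on any particular keys.

```python
import json
import urllib.request
from urllib.parse import quote

# Base URL copied from the curl command on this page.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality/perception"


def build_url(owner: str, repo: str) -> str:
    """Return the endpoint URL for one repository (hypothetical helper)."""
    # Percent-encode each path segment so unusual characters stay valid.
    return f"{API_BASE}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """GET the endpoint and parse the body as JSON.

    Assumption: the response is JSON. Its schema is not specified on
    this page, so the parsed object is returned unchanged.
    """
    with urllib.request.urlopen(build_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("MohamedHmini", "iww")` performs the same GET request as the curl command. Keyless access is limited to 100 requests/day, so a script that loops over many repositories should cache results or pause between calls.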
Higher-rated alternatives
scrapy/scrapy
Scrapy, a fast high-level web crawling & scraping framework for Python.
Altimis/Scweet
A simple and unlimited Twitter scraper: scrape tweets, likes, retweets, following, followers,...
lexiforest/curl_cffi
Python binding for curl-impersonate fork via cffi. A http client that can impersonate browser...
plabayo/rama
modular service framework to move and transform network packets
scrapinghub/spidermon
Scrapy Extension for monitoring spiders execution.