scrapinghub/web-poet
Web scraping Page Objects core library
This library helps web scraping developers organize their code for extracting specific data from websites. It takes raw HTML content from a web page and, through structured code, outputs the desired data points like product names, prices, or article text. It's designed for Python developers who build and maintain web scrapers.
105 stars. Used by 1 other package. Available on PyPI.
Use this if you are a web scraping developer looking to make your parsing logic more maintainable, reusable, and testable across different web pages.
Not ideal if you are looking for a complete web scraping framework or a tool that handles fetching web pages, as this focuses specifically on the data extraction part.
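The Page Object idea described above (raw HTML in, structured data points out) can be sketched in plain Python. This is a minimal illustration using only the standard library, not web-poet's own API. The names `ProductPage`, `Product`, `_ClassTextCollector`, and the sample HTML with its CSS class names are all hypothetical. Fetching the page is deliberately left out, in line with the library's focus on extraction.

```python
from dataclasses import dataclass
from html.parser import HTMLParser


class _ClassTextCollector(HTMLParser):
    """Records the first text found directly inside each element that has a class."""

    def __init__(self):
        super().__init__()
        self._stack = []  # class attribute (or None) of each open element
        self.texts = {}   # class name -> first non-empty text seen

    def handle_starttag(self, tag, attrs):
        self._stack.append(dict(attrs).get("class"))

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self._stack and self._stack[-1]:
            self.texts.setdefault(self._stack[-1], text)


@dataclass
class Product:
    name: str
    price: float


class ProductPage:
    """Hypothetical page object: wraps raw HTML and exposes extracted fields."""

    def __init__(self, html: str):
        collector = _ClassTextCollector()
        collector.feed(html)
        self._texts = collector.texts

    @property
    def name(self) -> str:
        return self._texts["product-name"]

    @property
    def price(self) -> float:
        # Assumes prices look like "$19.99" in this sample markup.
        return float(self._texts["product-price"].lstrip("$"))

    def to_item(self) -> Product:
        return Product(name=self.name, price=self.price)


# Assumed sample markup; a real page would come from a separate fetching step.
SAMPLE_HTML = """
<html><body>
  <h1 class="product-name">Blue Widget</h1>
  <span class="product-price">$19.99</span>
</body></html>
"""

item = ProductPage(SAMPLE_HTML).to_item()
print(item)
```

Because the parsing logic lives in one class that only needs an HTML string, it can be tested against saved pages without any network access, which is the maintainability and testability benefit the description refers to.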
Stars: 105
Forks: 18
Language: Python
License: BSD-3-Clause
Category
Last pushed: Apr 02, 2026
Commits (30d): 0
Dependencies: 11
Reverse dependents: 1
Related tools
scrapy/scrapy
Scrapy, a fast high-level web crawling & scraping framework for Python.
Altimis/Scweet
A simple and unlimited Twitter scraper: scrape tweets, likes, retweets, following, followers,...
lexiforest/curl_cffi
Python binding for curl-impersonate fork via cffi. An HTTP client that can impersonate browser...
plabayo/rama
Modular service framework to move and transform network packets.
scrapinghub/spidermon
Scrapy Extension for monitoring spiders execution.