destilabs/webtric
Universal Python script to scrape many typical websites
webtric extracts structured data from websites that present information in organized tables or grids, such as product listings. It takes a web page as input and outputs cleaned, extracted data that is ready for analysis or storage. It suits anyone who needs to collect information from many websites for tasks like competitive analysis or market research.
No commits in the last 6 months.
Use this if you need to quickly gather structured data from multiple web pages that present information in repetitive, organized layouts.
Not ideal if you need to scrape data from websites with highly complex, dynamic interfaces or require deep, customized parsing beyond simple table or tile structures.
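To make the "simple table" case concrete, here is a minimal sketch of turning an HTML table into a list of records using only the Python standard library. It is not taken from webtric and does not reflect how that project is implemented; the names `TableExtractor`, `extract_table`, and `SAMPLE` are hypothetical and exist only for this illustration. It assumes the first row of the table holds the column headers.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # row being built, or None when outside a <tr>
        self._cell = None   # text fragments of the open cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside an open cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse runs of whitespace and trim the cell text.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_table(html):
    """Return table rows as dicts keyed by the first row (assumed header)."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Hypothetical sample input standing in for a scraped product listing.
SAMPLE = """
<table>
  <tr><th>Product</th><th>Price</th></tr>
  <tr><td>Widget</td><td> 9.99 </td></tr>
  <tr><td>Gadget</td><td>19.50</td></tr>
</table>
"""

records = extract_table(SAMPLE)
print(records)
```

A parser like this covers only static, regular markup. Pages that build their tables with JavaScript, or that nest tables inside tables, fall into the "not ideal" category described above and would need a browser-automation tool instead.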
Stars: 12
Forks: 6
Language: Shell
License: not listed
Category: not listed
Last pushed: May 04, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/perception/destilabs/webtric"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
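The same request can be made from Python. The sketch below uses only the standard library and the URL shown in the curl command; the helper names `perception_url` and `fetch_perception` are hypothetical. The response format is not documented on this page, so the function returns the raw body as text (assumed here to be UTF-8) rather than guessing at its fields.

```python
import urllib.request

# Base path taken from the curl example; everything else is illustrative.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/perception"


def perception_url(repo):
    """Build the endpoint URL for a repository given as "owner/name"."""
    owner, name = repo.split("/")  # raises ValueError unless exactly one "/"
    if not owner or not name:
        raise ValueError(f"expected 'owner/name', got {repo!r}")
    return f"{BASE_URL}/{owner}/{name}"


def fetch_perception(repo, timeout=10.0):
    """Fetch the raw response body for a repository.

    The body is returned undecoded beyond UTF-8 text because its schema
    is not described in the listing.
    """
    with urllib.request.urlopen(perception_url(repo), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch_perception("destilabs/webtric")` issues the same GET request as the curl command. Each call counts against the daily request limit, so a script that checks many repositories may want to cache results locally.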
Higher-rated alternatives
seleniumbase/SeleniumBase
APIs for browser automation, testing, and bypassing bot-detection.
apify/crawlee-python
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers....
intoli/user-agents
A JavaScript library for generating random user agents with data that's updated daily.
apify/crawlee
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In...
Kaliiiiiiiiii-Vinyzu/patchright
Undetected version of the Playwright testing and automation library.