gidlov/copycat
A PHP Scraping Class
This is a PHP class that helps developers extract specific pieces of information from web pages, even across tens of thousands of pages. It takes URLs (or search engine queries) and regular expressions as input, then outputs structured text data and can download associated files like images. It's designed for PHP developers who need to programmatically collect data from public websites.
No commits in the last 6 months.
Use this if you are a PHP developer needing to programmatically scrape specific data points from websites, download files from those sites, or find relevant pages using a search engine.
Not ideal if you need a simpler, less code-intensive solution for web scraping or if you are working in a language other than PHP.
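The input/output model described above (named regular expressions in, structured text out) can be sketched in plain PHP. This is only an illustration of the idea, not Copycat's own API: the `extract_fields` helper, the pattern names, and the inline sample HTML are all assumptions made for this example, and a real run would fetch the page over HTTP first.

```php
<?php
// Hypothetical helper (not part of Copycat): apply named regex patterns to one page.
// Each pattern is expected to contain one capture group holding the wanted value.
function extract_fields(string $html, array $patterns): array
{
    $result = [];
    foreach ($patterns as $name => $pattern) {
        // Keep the first capture group, or null when the pattern does not match.
        $matched = preg_match($pattern, $html, $m) === 1 && isset($m[1]);
        $result[$name] = $matched ? trim($m[1]) : null;
    }
    return $result;
}

// Inline sample HTML stands in for a downloaded page so the sketch runs offline.
$html = '<h1 class="title">Example Movie</h1><span itemprop="ratingValue">9.3</span>';

$fields = extract_fields($html, [
    'title' => '/<h1 class="title">(.*?)<\/h1>/s',
    'score' => '/itemprop="ratingValue">(.*?)</s',
]);

print_r($fields);
```

Returning `null` for a pattern that does not match keeps every record the same shape, which matters when the same patterns are applied across many pages.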
Stars: 73
Forks: 13
Language: PHP
License: LGPL-3.0
Category: (none listed)
Last pushed: Sep 03, 2017
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/perception/gidlov/copycat"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
Higher-rated alternatives
scrapy/scrapy
Scrapy, a fast high-level web crawling & scraping framework for Python.
Altimis/Scweet
A simple and unlimited Twitter scraper: scrape tweets, likes, retweets, following, followers,...
lexiforest/curl_cffi
Python binding for curl-impersonate fork via cffi. An HTTP client that can impersonate browser...
plabayo/rama
Modular service framework to move and transform network packets.
scrapinghub/spidermon
Scrapy extension for monitoring spider execution.