hrbrmstr/wayback
Tools to Work with the Various Internet Archive Wayback Machine APIs
This project helps researchers and data analysts explore historical versions of websites and digital objects. Given a URL or an Internet Archive identifier, it returns information about archived versions, including their availability and timestamps, and can retrieve the full content of archived pages. It suits anyone who needs to track changes to websites over time, verify past content, or analyze historical web data.
No commits in the last 6 months.
Use this if you need to programmatically access and analyze past versions of web pages or other digital assets stored in the Internet Archive.
Not ideal if you only need to manually browse a single archived page, as the Internet Archive's website interface is sufficient for that purpose.
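To make the "provide a URL, get back snapshot information" flow concrete, here is a minimal Python sketch that queries the Internet Archive's public Wayback availability endpoint directly. It does not use this package or reproduce its interface; the function names (`build_availability_url`, `closest_snapshot`, `lookup`) are hypothetical and chosen only for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public Wayback Machine availability endpoint (not part of this package).
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_availability_url(url, timestamp=None):
    """Build the query URL for a page, optionally near a given timestamp."""
    params = {"url": url}
    if timestamp is not None:
        # Timestamp format is YYYYMMDD, optionally followed by hhmmss.
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(payload):
    """Extract the closest snapshot from a decoded JSON response.

    Returns None when the response lists no available snapshot.
    """
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return {
        "url": closest["url"],
        "timestamp": closest["timestamp"],
        "status": closest.get("status"),
    }


def lookup(url, timestamp=None):
    """Hypothetical helper: fetch and parse the closest snapshot (needs network)."""
    with urlopen(build_availability_url(url, timestamp), timeout=30) as resp:
        return closest_snapshot(json.load(resp))
```

Calling `lookup("example.com", "20200101")` requires network access and returns either `None` or a small dictionary with the snapshot's URL, timestamp, and HTTP status. Keeping URL construction and response parsing in separate functions lets both be checked without contacting the archive.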
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/perception/hrbrmstr/wayback"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
scrapy/scrapy
Scrapy, a fast high-level web crawling & scraping framework for Python.
Altimis/Scweet
A simple and unlimited Twitter scraper: scrape tweets, likes, retweets, following, followers,...
lexiforest/curl_cffi
Python binding for curl-impersonate fork via cffi. An HTTP client that can impersonate browser...
plabayo/rama
Modular service framework to move and transform network packets.
scrapinghub/spidermon
Scrapy Extension for monitoring spiders execution.