ArchiveBox/abx-dl
⬇️ A simple all-in-one CLI tool to download EVERYTHING from a URL (like youtube-dl/yt-dlp, forum-dl, gallery-dl, simpler ArchiveBox). 🎭 Uses headless Chrome to get HTML, JS, CSS, images/video/audio/subtitles, PDFs, screenshots, article text, git repos, and more...
This tool helps anyone who needs to fully capture a webpage or online content. You provide a URL, and it downloads everything associated with that page: HTML, images, videos, PDFs, article text, and even entire websites. It's ideal for researchers, journalists, or anyone building a personal archive of web content.
102 stars. Available on PyPI.
Use this if you need to reliably download all available content from a URL, including dynamic elements and media, for archiving, research, or offline access.
Not ideal if you only need a simple screenshot or specific text from a page and prefer a lightweight, single-purpose tool.
Stars: 102
Forks: 4
Language: Python
License: MIT
Category:
Last pushed: Mar 27, 2026
Commits (30d): 0
Dependencies: 11
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/perception/ArchiveBox/abx-dl"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
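The same request can be made from a script. The following is a minimal Python sketch using only the standard library, assuming the endpoint follows the owner/repo path pattern shown in the curl command above. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and because the response format is not described here, the sketch returns the body as raw text rather than parsing it.

```python
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; the owner/repo suffix pattern is an assumption.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality/perception"


def build_quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    # Percent-encode each path segment so unusual names cannot break the URL.
    return f"{API_BASE}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the raw response body as text; no response schema is assumed."""
    url = build_quality_url(owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch_quality("ArchiveBox", "abx-dl")` performs the network request. Each call counts against the daily request limit, so a script that polls many repositories may want to cache the results locally.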
Related tools
seleniumbase/SeleniumBase
APIs for browser automation, testing, and bypassing bot-detection.
apify/crawlee-python
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers....
intoli/user-agents
A JavaScript library for generating random user agents with data that's updated daily.
apify/crawlee
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In...
Kaliiiiiiiiii-Vinyzu/patchright
Undetected version of the Playwright testing and automation library.