ArchiveBox/abx-dl

⬇️ A simple all-in-one CLI tool to download EVERYTHING from a URL (like youtube-dl/yt-dlp, forum-dl, gallery-dl, simpler ArchiveBox). 🎭 Uses headless Chrome to get HTML, JS, CSS, images/video/audio/subtitles, PDFs, screenshots, article text, git repos, and more...

54 / 100

Established

This tool helps anyone who needs to fully capture a webpage or online content. You provide a URL, and it downloads everything associated with that page: HTML, images, videos, PDFs, article text, and even entire websites. It's ideal for researchers, journalists, or anyone building a personal archive of web content.

102 stars. Available on PyPI.

Use this if you need to reliably download all available content from a URL, including dynamic elements and media, for archiving, research, or offline access.

Not ideal if you only need a simple screenshot or specific text from a page and prefer a lightweight, single-purpose tool.

web-archiving digital-preservation online-research content-capture OSINT

Maintenance 13 / 25

Adoption 9 / 25

Maturity 25 / 25

Community 7 / 25


Stars 102

Language Python

License MIT

Featured in

Giving AI Agents Eyes: Browser Automation in 2026

Related tools

seleniumbase/SeleniumBase

APIs for browser automation, testing, and bypassing bot-detection.

apify/crawlee-python

Crawlee—A web scraping and browser automation library for Python to build reliable crawlers....

intoli/user-agents

A JavaScript library for generating random user agents with data that's updated daily.

apify/crawlee

Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In...

Kaliiiiiiiiii-Vinyzu/patchright

Undetected version of the Playwright testing and automation library.
