tomcardoso/intro-to-scraping
An introduction to web and document scraping
This project teaches researchers, analysts, and anyone who needs to gather information efficiently how to automate data collection from websites and from documents such as PDFs. You'll learn to turn unstructured web pages or offline files into organized, usable datasets without manual entry. It's designed for people who need to build their own data sources for analysis.
No commits in the last 6 months.
Use this if you spend countless hours manually extracting data from websites or documents and want to automate this tedious process.
Not ideal if you already have access to well-structured databases or APIs that provide all the data you need directly.
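As a rough illustration of the kind of transformation the tutorial covers, here is a minimal sketch that parses an HTML table into rows using only the Python standard library. The HTML and column names are invented for this example, not taken from the course material; in a real scrape the HTML would be fetched from a live page.

```python
from html.parser import HTMLParser

# Hypothetical input: in a real scrape this would be downloaded from a
# web page (e.g. with urllib or requests); here it is hardcoded.
HTML = """
<table>
  <tr><th>City</th><th>Population</th></tr>
  <tr><td>Lisbon</td><td>545923</td></tr>
  <tr><td>Porto</td><td>231800</td></tr>
</table>
"""

class TableParser(HTMLParser):
    """Collects the text of <td>/<th> cells into a list of rows."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []          # start a new row
        elif tag in ("td", "th"):
            self._in_cell = True    # start collecting cell text

    def handle_endtag(self, tag):
        if tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
        elif tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell and self._row is not None:
            self._row.append(data.strip())

parser = TableParser()
parser.feed(HTML)
print(parser.rows)
# → [['City', 'Population'], ['Lisbon', '545923'], ['Porto', '231800']]
```

Dedicated libraries such as BeautifulSoup or Scrapy make this far less verbose, but the idea is the same: walk the markup, keep the cells you care about, and emit structured rows.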
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/perception/tomcardoso/intro-to-scraping"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
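The same endpoint can be queried from Python. This sketch only builds the request URL for an owner/repo pair; the URL pattern comes from the curl command above, while the helper function name is an illustrative assumption.

```python
from urllib.parse import quote

# Base path taken from the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/quality/perception"

def perception_url(owner: str, repo: str) -> str:
    """Build the perception API URL for a given GitHub owner/repo pair."""
    return f"{BASE}/{quote(owner)}/{quote(repo)}"

url = perception_url("tomcardoso", "intro-to-scraping")
print(url)
# → https://pt-edge.onrender.com/api/v1/quality/perception/tomcardoso/intro-to-scraping
# To fetch it, pass the URL to urllib.request.urlopen or requests.get;
# no key is needed up to 100 requests/day, per the note above.
```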
Higher-rated alternatives
scrapy/scrapy
Scrapy, a fast high-level web crawling & scraping framework for Python.
Altimis/Scweet
A simple and unlimited Twitter scraper: scrape tweets, likes, retweets, following, followers,...
lexiforest/curl_cffi
Python binding for the curl-impersonate fork via cffi. An HTTP client that can impersonate browser...
plabayo/rama
Modular service framework to move and transform network packets.
scrapinghub/spidermon
Scrapy Extension for monitoring spiders execution.