jamesturk/spatula

A modern Python library for writing maintainable web scrapers.

Emerging

This is a Python library that helps developers write code to extract information from websites and other online documents. It takes in source documents (HTML, CSV, JSON, XML, PDF, Excel) and outputs structured data that can be stored and used. It is designed for software developers and data engineers who need to collect data from the web regularly in a robust, organized way.
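
As a rough illustration of the "documents in, structured data out" idea, the sketch below uses only the Python standard library. It is not spatula's own API: `TableRowParser`, `records_from_html`, `records_from_csv`, and the sample data are hypothetical names made up for this example. It shows two input formats being reduced to the same list of records.

```python
import csv
import io
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Hypothetical helper: collect the text of each <td>, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            if self._row:  # skip rows that had no <td> cells
                self.rows.append(self._row)
            self._row = None
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data.strip()


def records_from_html(html_text, fields):
    """Turn table rows in an HTML string into a list of dicts."""
    parser = TableRowParser()
    parser.feed(html_text)
    parser.close()
    return [dict(zip(fields, row)) for row in parser.rows]


def records_from_csv(csv_text):
    """Turn CSV text with a header row into a list of dicts."""
    return [dict(row) for row in csv.DictReader(io.StringIO(csv_text))]


# Made-up sample inputs; a real scraper would fetch these from a URL or file.
SAMPLE_HTML = (
    "<table>"
    "<tr><td>Ada</td><td>Engineer</td></tr>"
    "<tr><td>Grace</td><td>Admiral</td></tr>"
    "</table>"
)
SAMPLE_CSV = "name,role\nAda,Engineer\nGrace,Admiral\n"

html_records = records_from_html(SAMPLE_HTML, ["name", "role"])
csv_records = records_from_csv(SAMPLE_CSV)
print(html_records)
print(csv_records)
```

Keeping the parsing for each format behind a small function that returns plain records is one way to keep scraper code readable as sources change. A library such as spatula packages this kind of structure for you, so consult its documentation for the actual classes it provides.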

250 stars.

Use this if you are a developer building a web scraper and want code that is easy to understand and maintain and that can handle data formats beyond HTML.

Not ideal if you are looking for a no-code solution or a simple command-line tool for one-off data extraction without writing Python code.

web-scraping data-extraction data-engineering developer-tool data-collection

No Package, No Dependents

Maintenance 6 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 10 / 25

Stars

250

Forks

Language

Python

License

MIT

Featured in

Giving AI Agents Eyes: Browser Automation in 2026

Higher-rated alternatives

scrapy/scrapy

Scrapy, a fast high-level web crawling & scraping framework for Python.

Altimis/Scweet

A simple and unlimited twitter scraper : scrape tweets, likes, retweets, following, followers,...

lexiforest/curl_cffi

Python binding for curl-impersonate fork via cffi. A http client that can impersonate browser...

plabayo/rama

modular service framework to move and transform network packets

scrapinghub/spidermon

Scrapy Extension for monitoring spiders execution.
