jamesturk/spatula

A modern Python library for writing maintainable web scrapers.

42 / 100 (Emerging)

This is a Python library that helps developers write code to extract information from websites and other online documents. It takes web pages and other documents (HTML, CSV, JSON, XML, PDF, Excel) as input and outputs structured data that can be stored and reused. It is designed for software developers and data engineers who need to collect data from the web regularly in a robust, organized way.
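To make the "page in, structured data out" idea concrete, here is a minimal sketch that uses only the Python standard library. It is not spatula's API: `RowParser`, `StaffListPage`, `SAMPLE_HTML`, and the field names are hypothetical, chosen only to illustrate the pattern of one class per page type that yields one record per row.

```python
from html.parser import HTMLParser


class RowParser(HTMLParser):
    """Collect the text of each <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []          # finished rows, each a list of cell strings
        self._row = None        # cells of the row being read, or None
        self._in_cell = False   # True while inside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data.strip()


class StaffListPage:
    """Hypothetical page class: turns one HTML table into a list of dicts."""

    fields = ("first", "last", "position")  # assumed column order

    def __init__(self, html):
        self.html = html

    def process_item(self, cells):
        # Map one row of cell text onto the named fields.
        return dict(zip(self.fields, cells))

    def items(self):
        parser = RowParser()
        parser.feed(self.html)
        return [self.process_item(cells) for cells in parser.rows]


# Made-up sample input, standing in for a fetched page.
SAMPLE_HTML = """
<table>
  <tr><td>Ada</td><td>Lovelace</td><td>Analyst</td></tr>
  <tr><td>Alan</td><td>Turing</td><td>Engineer</td></tr>
</table>
"""

records = StaffListPage(SAMPLE_HTML).items()
```

Keeping the parsing details in one small class per page type is what makes this style of scraper easier to maintain: when a site changes its layout, only that class needs to change.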

250 stars.

Use this if you are a developer building a web scraper and want code that is easy to understand and maintain and that can handle data formats beyond HTML.

Not ideal if you are looking for a no-code solution or a simple command-line tool for one-off data extraction without writing Python code.

web-scraping data-extraction data-engineering developer-tool data-collection
No Package, No Dependents
Maintenance 6 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 10 / 25

Stars: 250
Forks: 12
Language: Python
License: MIT
Category: scraper
Last pushed: Nov 22, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/perception/jamesturk/spatula"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
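The same request can be made from Python. The sketch below uses only the standard library and rests on two assumptions not stated on this page: that the endpoint returns JSON, and that other repositories follow the same `owner/repo` path pattern. The function names `perception_url` and `fetch_perception` are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality/perception"


def perception_url(owner, repo):
    """Build the endpoint URL for one repository (path pattern is assumed)."""
    return f"{API_BASE}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_perception(owner, repo, timeout=10):
    """Fetch and decode the response, assuming the body is JSON."""
    with urlopen(perception_url(owner, repo), timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_perception("jamesturk", "spatula")` needs network access and counts against the daily request limit.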