bytewife/substack_scraper

A scraper for Substack article text content

38 / 100 (Emerging)

This tool helps you gather public article content from multiple Substack blogs. You provide the URLs of the Substack blogs, and it downloads their posts as individual text files. This is useful for researchers, content analysts, or anyone looking to collect public Substack content for analysis or training data.

No commits in the last 6 months.

Use this if you need to quickly extract all publicly available text content from several Substack newsletters into a collection of simple text files.

Not ideal if you need to access content from subscriber-only articles, as it will only capture the publicly visible, truncated portions.
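
The workflow described above (blog post URLs in, one text file per post out) can be pictured with a minimal Python sketch. This is not the tool's own code, which is written in Rust; the function names `post_path` and `save_posts`, the `<output dir>/<blog host>/<slug>.txt` layout, and the idea of passing already-downloaded text in a dictionary are all assumptions made for illustration.

```python
import re
from pathlib import Path
from urllib.parse import urlparse


def post_path(out_dir: Path, post_url: str) -> Path:
    """Map a post URL to <out_dir>/<blog host>/<slug>.txt (hypothetical layout)."""
    parts = urlparse(post_url)
    # Use the last path segment as the slug; fall back to "index" if empty.
    slug = parts.path.rstrip("/").rsplit("/", 1)[-1] or "index"
    # Replace anything that is not filesystem-safe with an underscore.
    safe_slug = re.sub(r"[^A-Za-z0-9._-]", "_", slug)
    return out_dir / parts.netloc / f"{safe_slug}.txt"


def save_posts(posts: dict[str, str], out_dir: Path) -> list[Path]:
    """Write each post's text to its own file; `posts` maps post URL -> text."""
    written = []
    for url, text in posts.items():
        path = post_path(out_dir, url)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(text, encoding="utf-8")
        written.append(path)
    return written
```

Grouping files by blog host keeps posts from different newsletters apart when several blogs are collected into the same output directory.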

Tags: content-analysis, research-data-collection, newsletter-archiving, text-corpus-creation
Status: Stale (6m), No Package, No Dependents
Maintenance 0 / 25
Adoption 7 / 25
Maturity 16 / 25
Community 15 / 25

Stars: 32
Forks: 6
Language: Rust
License: MIT
Category: scraper
Last pushed: Oct 05, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/perception/bytewife/substack_scraper"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
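
The same request can be made from a script. The sketch below uses only Python's standard library and returns the raw response body as text, since the response format is not described here; the helper names `quality_url` and `fetch_quality` are hypothetical, and the URL pattern is inferred from the single curl example above.

```python
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; assumed to be the same for other repos.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality/perception"


def quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for an owner/repo pair."""
    return f"{API_BASE}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the endpoint and return the undecoded-structure body as text."""
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch_quality("bytewife", "substack_scraper")` performs one request, which counts against the daily limit.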