gonzalezcortes/scraping_news_articles

Python Scripts for Academic Web Scraping of WSJ Articles: Database Setup, Crawl, and Scrape.

34 / 100 Emerging

This tool helps academic researchers gather specific news articles from the Wall Street Journal for their studies. It takes a target year and WSJ website pages as input, then extracts article links, headlines, publication times, and full text. The output is a structured SQLite database containing the scraped content and metadata, ready for research analysis.
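The repository's actual schema is not shown on this page. As a minimal sketch of the structured output described above, assuming a single hypothetical `articles` table and a crawl step that stores links before a scrape step fills in the text, the database could be organized like this with Python's built-in `sqlite3` module. The table name, column names, and helper functions are illustrative assumptions, not the project's code.

```python
import sqlite3

# Hypothetical schema: table and column names are assumptions made for
# illustration, not the repository's actual layout.
SCHEMA = """
CREATE TABLE IF NOT EXISTS articles (
    id           INTEGER PRIMARY KEY,
    link         TEXT NOT NULL UNIQUE,  -- article URL found during the crawl
    year         INTEGER NOT NULL,      -- target year given as input
    headline     TEXT,
    published_at TEXT,                  -- publication time, stored as text
    body         TEXT                   -- full text, filled in by the scrape step
)
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def save_link(conn, link, year, headline=None, published_at=None):
    """Crawl step: record a link once; a repeated link is silently ignored."""
    conn.execute(
        "INSERT OR IGNORE INTO articles (link, year, headline, published_at) "
        "VALUES (?, ?, ?, ?)",
        (link, year, headline, published_at),
    )
    conn.commit()


def save_body(conn, link, body):
    """Scrape step: attach the full text to a previously crawled link."""
    conn.execute("UPDATE articles SET body = ? WHERE link = ?", (body, link))
    conn.commit()


def pending_links(conn, year):
    """Return links for the given year that still have no full text."""
    rows = conn.execute(
        "SELECT link FROM articles WHERE year = ? AND body IS NULL ORDER BY id",
        (year,),
    )
    return [row[0] for row in rows]
```

Keeping the link unique lets the crawl be rerun without creating duplicate rows, and `pending_links` lets an interrupted scrape resume where it stopped.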

No commits in the last 6 months.

Use this if you are an academic researcher who needs to systematically collect and organize Wall Street Journal articles for a specific research project.

Not ideal if you need to scrape data from websites other than the Wall Street Journal or require a solution for non-academic, commercial purposes.

academic-research news-analysis data-collection media-studies social-sciences
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 5 / 25
Maturity 16 / 25
Community 13 / 25

Stars 9
Forks 2
Language Python
License MIT
Category scraper
Last pushed Oct 12, 2023
Commits (30d) 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/perception/gonzalezcortes/scraping_news_articles"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
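The same request can be made from Python using only the standard library. This is a sketch under stated assumptions: the helper names `quality_url` and `fetch_quality` are hypothetical, and because the response format is not documented on this page, the body is returned as raw text rather than parsed.

```python
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/perception"


def quality_url(owner, repo):
    """Build the endpoint URL for one repository (hypothetical helper)."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner, repo, timeout=10.0):
    """Send a keyless GET request and return the response body as text.

    The response format is an unknown here, so nothing is parsed.
    """
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling `fetch_quality("gonzalezcortes", "scraping_news_articles")` issues the same GET request as the curl command and counts against the same daily limit.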