justmarkham/trump-lies
Tutorial: Web scraping in Python with Beautiful Soup
This project helps you take information published on a website, like a news article, and turn it into a structured dataset. It shows you how to programmatically extract specific pieces of text (like dates, quotes, and links) from a web page and save them into a file. Anyone who needs to collect and organize data from public websites for analysis or record-keeping would find this useful.
247 stars. No commits in the last 6 months.
Use this if you need to extract specific information from a static web page and store it in a clean, organized format.
Not ideal if you're dealing with complex websites that require login, load content dynamically with JavaScript, or use strong anti-scraping measures.
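The workflow described above — fetch a static page, locate repeated elements, and extract dates, quotes, and links into a structured dataset — can be sketched with Beautiful Soup. This is a minimal illustration, not the tutorial's actual code: the HTML snippet and the class names (`short-desc`) are hypothetical stand-ins for whatever structure the target page uses.

```python
from bs4 import BeautifulSoup

# Hypothetical page fragment; a real scraper would fetch this with
# requests.get(url).text instead of a hard-coded string.
html = """
<span class="short-desc">
  <strong>Jan. 21</strong> "I wasn't a fan of Iraq."
  <a href="https://example.com/fact-check">(He was.)</a>
</span>
"""

soup = BeautifulSoup(html, "html.parser")

records = []
for span in soup.find_all("span", class_="short-desc"):
    date = span.find("strong").get_text(strip=True)  # e.g. "Jan. 21"
    link = span.find("a")["href"]                    # the source URL
    records.append((date, link))

print(records)  # [('Jan. 21', 'https://example.com/fact-check')]
```

From here, a list of tuples like `records` can be loaded into a pandas DataFrame and saved to CSV, which is the "structured dataset" half of the workflow.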
Stars
247
Forks
217
Language
Jupyter Notebook
License
—
Category
—
Last pushed
Nov 18, 2018
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/perception/justmarkham/trump-lies"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
scrapy/scrapy
Scrapy, a fast high-level web crawling & scraping framework for Python.
Altimis/Scweet
A simple and unlimited twitter scraper : scrape tweets, likes, retweets, following, followers,...
lexiforest/curl_cffi
Python binding for curl-impersonate fork via cffi. A http client that can impersonate browser...
plabayo/rama
modular service framework to move and transform network packets
scrapinghub/spidermon
Scrapy Extension for monitoring spiders execution.