KashmalaJamshaid/Web-scraping-using-python-and-beautifulsoup
This notebook demonstrates web scraping with BeautifulSoup and Selenium. It takes a website URL as input and extracts the following from that page:
1. Specific HTML tags, along with titles and meta descriptions
2. Specific tags and heading tags (h1 to h6), along with titles and meta descriptions
3. Image alt attributes
4. A count of the words on the page
5. Broken links on the page
6. The page's source code
This tool helps marketing specialists, SEO analysts, and web content managers quickly gather detailed information from any webpage. You input a website URL and receive a structured output including specific HTML tags, titles, meta descriptions, image alt tags, word counts, and a report on broken links. It simplifies auditing web content and technical SEO aspects.
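As a rough illustration of the on-page checks described above (title, meta description, h1 to h6 headings, image alt text, word count), here is a minimal sketch. It is not the notebook's code: it swaps in the standard library's `html.parser` instead of BeautifulSoup so it runs without extra installs, and the names `PageAudit` and `audit_html` are hypothetical. It works on an HTML string that has already been downloaded.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
SKIP_TAGS = {"script", "style"}  # their text is not visible page content


class PageAudit(HTMLParser):
    """Hypothetical helper: collects SEO-relevant fields while parsing."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta_description = None
        self.headings = []    # (tag, text) pairs in document order
        self.image_alts = []  # one entry per <img>; None when alt is missing
        self.word_count = 0   # visible words, excluding <title>, scripts, styles
        self._in_title = False
        self._heading = None  # heading tag currently open, if any
        self._parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            if (attributes.get("name") or "").lower() == "description":
                self.meta_description = attributes.get("content")
        elif tag == "img":
            self.image_alts.append(attributes.get("alt"))
        elif tag in HEADING_TAGS:
            self._heading, self._parts = tag, []
        elif tag in SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == self._heading:
            # Collapse whitespace so nested inline tags read as one string.
            text = " ".join("".join(self._parts).split())
            self.headings.append((tag, text))
            self._heading = None
        elif tag in SKIP_TAGS and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth:
            return
        if self._in_title:
            self.title += data
            return
        if self._heading:
            self._parts.append(data)
        self.word_count += len(data.split())


def audit_html(html):
    """Parse an HTML string and return the collected fields as a dict."""
    parser = PageAudit()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title.strip(),
        "meta_description": parser.meta_description,
        "headings": parser.headings,
        "image_alts": parser.image_alts,
        "word_count": parser.word_count,
    }
```

A `None` entry in `image_alts` marks an image with no alt attribute, which is usually the case an SEO audit wants to surface.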
No commits in the last 6 months.
Use this if you need to quickly extract specific content elements, check for broken links, or audit SEO-critical information like titles and meta descriptions from web pages.
Not ideal if you need to interact with dynamic website elements that require complex user authentication or form submissions beyond basic data extraction.
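The broken-link check mentioned above could be sketched as follows. This is an assumption about the general approach, not the notebook's implementation: it uses only the standard library, treats any status of 400 or above (or a failed request) as broken, and the names `extract_links`, `http_status`, and `find_broken_links` are hypothetical.

```python
from html.parser import HTMLParser
from urllib.error import HTTPError
from urllib.parse import urljoin, urlparse
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collects the raw href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def extract_links(html, base_url):
    """Return unique absolute http(s) links, in document order."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    links = []
    for href in collector.hrefs:
        absolute = urljoin(base_url, href)  # resolve relative links
        if urlparse(absolute).scheme in ("http", "https") and absolute not in links:
            links.append(absolute)
    return links


def http_status(url, timeout=10):
    """Return the HTTP status for url, or None if no response arrives.

    Uses HEAD to avoid downloading bodies; some servers reject HEAD,
    so a real checker may need to fall back to GET.
    """
    request = Request(url, method="HEAD", headers={"User-Agent": "link-check-sketch"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as error:
        return error.code
    except OSError:  # covers URLError, DNS failures, and timeouts
        return None


def find_broken_links(html, base_url, get_status=http_status):
    """Return (link, status) pairs for links that look broken."""
    broken = []
    for link in extract_links(html, base_url):
        status = get_status(link)
        if status is None or status >= 400:
            broken.append((link, status))
    return broken
```

Passing the status function as a parameter keeps the link logic testable offline: a dictionary lookup can stand in for real HTTP requests.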
Stars: 10
Forks: 9
Language: Jupyter Notebook
License: none listed
Category: (not listed)
Last pushed: Aug 04, 2021
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/perception/KashmalaJamshaid/Web-scraping-using-python-and-beautifulsoup"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
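The same request can be made from Python. This is a sketch under assumptions: the `/<owner>/<repo>` URL pattern is inferred from the single curl example above, the response format is not stated here so the body is returned as raw text, and `quality_url` and `fetch_quality` are hypothetical helper names.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Endpoint root taken from the curl example; the path layout is an assumption.
API_ROOT = "https://pt-edge.onrender.com/api/v1/quality/perception"


def quality_url(owner, repo):
    """Build the request URL for a repository, escaping each path segment."""
    return f"{API_ROOT}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner, repo, timeout=10):
    """Fetch the endpoint and return the response body as text (format unverified)."""
    with urlopen(quality_url(owner, repo), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

For example, `fetch_quality("KashmalaJamshaid", "Web-scraping-using-python-and-beautifulsoup")` issues the same GET request as the curl command.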
Higher-rated alternatives
scrapy/scrapy
Scrapy, a fast high-level web crawling & scraping framework for Python.
Altimis/Scweet
A simple and unlimited twitter scraper : scrape tweets, likes, retweets, following, followers,...
lexiforest/curl_cffi
Python binding for curl-impersonate fork via cffi. A http client that can impersonate browser...
plabayo/rama
modular service framework to move and transform network packets
scrapinghub/spidermon
Scrapy Extension for monitoring spiders execution.