Yyalexx/scraping-to-postgresql-data-base
Multithreaded automated website scraping with the results loaded into a PostgreSQL database (selenium, concurrent, beautifulsoup, bleach)
This project helps culinary app developers or food bloggers automatically gather recipe information from websites. It takes URLs of recipe pages as input and outputs a structured PostgreSQL database filled with recipe details like ingredients and instructions. It's designed for anyone building a recipe-focused application or content platform who needs a large, organized dataset of recipes.
No commits in the last 6 months.
Use this if you need to quickly populate a PostgreSQL database with structured recipe data scraped from various websites for a new food application or recipe collection.
Not ideal if the target websites require complex authentication or have widely varying layouts, or if you need a general-purpose web scraping tool for non-recipe data.
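The flow described above (recipe URLs in, database rows out, with pages processed in parallel) can be sketched roughly as follows. This is a minimal illustration, not the repository's code. To stay runnable with the standard library alone it swaps in `html.parser` for selenium/beautifulsoup and an in-memory `sqlite3` database for PostgreSQL. The `fetch` callable, the `recipes` table, and the `<li class="ingredient">` markup are all hypothetical.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser


class RecipeParser(HTMLParser):
    """Collects the <h1> title and each <li class="ingredient"> (assumed markup)."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.ingredients = []
        self._target = None  # field that the current text node belongs to

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self._target = "title"
        elif tag == "li" and dict(attrs).get("class") == "ingredient":
            self._target = "ingredient"

    def handle_endtag(self, tag):
        if tag in ("h1", "li"):
            self._target = None

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._target == "title":
            self.title += text
        elif self._target == "ingredient":
            self.ingredients.append(text)


def parse_recipe(url, fetch):
    """Download one page through the injected fetch(url) -> html callable and parse it."""
    parser = RecipeParser()
    parser.feed(fetch(url))
    parser.close()
    return url, parser.title, "; ".join(parser.ingredients)


def scrape_to_db(urls, fetch, conn, workers=4):
    """Parse pages in worker threads, then insert all rows from the calling thread."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS recipes (url TEXT, title TEXT, ingredients TEXT)"
    )
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input URLs.
        rows = list(pool.map(lambda u: parse_recipe(u, fetch), urls))
    # sqlite3 uses "?" placeholders; psycopg2 (PostgreSQL) would use "%s".
    conn.executemany("INSERT INTO recipes VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

Keeping the database writes on a single thread avoids sharing one connection across workers; only the slow part, fetching and parsing, runs in parallel. With a real PostgreSQL target, `conn` would come from a driver such as psycopg2 and `fetch` from a browser or HTTP client.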
Stars: 8
Forks: 1
Language: Jupyter Notebook
License: not specified
Category:
Last pushed: Mar 09, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/perception/Yyalexx/scraping-to-postgresql-data-base"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
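For use from a script, the same request can be made with Python's standard library. This is a sketch under one assumption that the page does not confirm: that the endpoint returns JSON. The helper names `perception_url` and `fetch_perception` are hypothetical; only the base URL is taken from the curl command above.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path copied from the curl example; everything else is illustrative.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/perception"


def perception_url(owner, repo):
    """Build the endpoint URL for an owner/repo pair, percent-encoding each part."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_perception(owner, repo, timeout=10):
    """Request the data and decode it, assuming (unverified) a JSON response body."""
    with urlopen(perception_url(owner, repo), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_perception("Yyalexx", "scraping-to-postgresql-data-base")` performs the network request; the unauthenticated limit of 100 requests/day stated above applies.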
Higher-rated alternatives
seleniumbase/SeleniumBase
APIs for browser automation, testing, and bypassing bot-detection.
apify/crawlee-python
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers....
intoli/user-agents
A JavaScript library for generating random user agents with data that's updated daily.
apify/crawlee
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In...
Kaliiiiiiiiii-Vinyzu/patchright
Undetected version of the Playwright testing and automation library.