havanagrawal/GoodreadsScraper
Scrape data from Goodreads using Scrapy and Selenium
This tool helps you gather extensive book and author data from Goodreads without manual effort. You input specific Goodreads lists, authors, or your own reading shelves, and it outputs structured data files containing details like titles, descriptions, ratings, genres, and author information. It's designed for researchers, data analysts, or bibliophiles who want to analyze trends, build recommendation systems, or study literary patterns.
146 stars. No commits in the last 6 months.
Use this if you need to quickly collect a large dataset of book and author metadata from Goodreads for analysis, visualization, or personal projects.
Not ideal if you only need data for a few books or authors, or if you prefer a simple point-and-click interface rather than running scripts.
Stars: 146
Forks: 41
Language: Python
License: MIT
Category: (none listed)
Last pushed: May 25, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/perception/havanagrawal/GoodreadsScraper"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
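The same request can be made from Python. This is a minimal sketch using only the standard library, not a documented client. It makes two assumptions that the page does not confirm: that the endpoint returns a JSON body, and that other repositories follow the same `owner/repo` path pattern as the curl example. The helper names `build_url` and `fetch_repo_data` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/perception"


def build_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository.

    Assumption: every repository uses the same owner/repo path pattern.
    """
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_repo_data(owner: str, repo: str, timeout: float = 10.0):
    """Send a GET request and decode the response.

    Assumption: the response body is JSON; adjust the decoding if it is not.
    """
    with urllib.request.urlopen(build_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_repo_data("havanagrawal", "GoodreadsScraper")` would issue the same request as the curl command. With the keyless limit of 100 requests/day, it may be worth caching the result locally rather than calling the endpoint repeatedly.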
Higher-rated alternatives
seleniumbase/SeleniumBase
APIs for browser automation, testing, and bypassing bot-detection.
apify/crawlee-python
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers....
intoli/user-agents
A JavaScript library for generating random user agents with data that's updated daily.
apify/crawlee
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In...
Kaliiiiiiiiii-Vinyzu/patchright
Undetected version of the Playwright testing and automation library.