Tiago-Lira/scrapyd-mongodb
Library designed to replace the SQLite backend with a MongoDB backend in Scrapyd's queue management
This helps Python developers who run many web crawlers manage their crawling tasks more effectively. It stores pending crawl jobs in a MongoDB database rather than the default file-based SQLite store, so developers can handle a much larger volume of crawling jobs and results, making their web scraping operations more robust and scalable.
No commits in the last 6 months. Available on PyPI.
Use this if you are a Python developer managing a large number of web crawling jobs with Scrapy and need a more scalable and robust way to manage your task queue and results than the default SQLite option.
Not ideal if you are not using Scrapy for web crawling, or if your web scraping needs are small and a simple, file-based queue management system is sufficient.
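In practice the swap happens in Scrapyd's configuration file, where the spider-queue class can be pointed at a MongoDB-backed implementation. The sketch below is illustrative only: the option names and the scrapyd_mongodb module path are assumptions, not taken from this page, so check the project's README for the real settings.

```ini
# scrapyd.conf — illustrative sketch; option and module names are assumptions
[scrapyd]
# Point Scrapyd at a MongoDB-backed queue instead of the default SQLite one
spiderqueue = scrapyd_mongodb.queue.MongoDBSpiderQueue

# Hypothetical connection settings for the MongoDB instance
mongodb_host = 127.0.0.1
mongodb_port = 27017
mongodb_db   = scrapyd
```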
Stars: 17
Forks: 9
Language: Python
License: MIT
Category:
Last pushed: Sep 02, 2017
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/perception/Tiago-Lira/scrapyd-mongodb"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
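The endpoint returns JSON over plain HTTPS, so it can be called from Python's standard library with no API key. Only the URL shape comes from the page above; the response fields are undocumented here, so the sketch treats the payload as an opaque dict.

```python
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality/perception"


def quality_url(owner: str, repo: str) -> str:
    """Build the per-repository quality endpoint URL."""
    return f"{API_BASE}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch and decode the JSON payload for one repository.

    The response schema is not documented on this page, so callers
    should treat the returned dict's keys as unknown.
    """
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    print(quality_url("Tiago-Lira", "scrapyd-mongodb"))
```

Requests beyond the free daily quota would presumably need the key mentioned above, likely passed as a header or query parameter; that detail is not specified on this page.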
Higher-rated alternatives
scrapy/scrapy
Scrapy, a fast high-level web crawling & scraping framework for Python.
Altimis/Scweet
A simple and unlimited Twitter scraper: scrape tweets, likes, retweets, following, followers,...
lexiforest/curl_cffi
Python binding for the curl-impersonate fork via cffi. An HTTP client that can impersonate browser...
plabayo/rama
modular service framework to move and transform network packets
scrapinghub/spidermon
Scrapy Extension for monitoring spiders execution.