PreferredAI/venom
Your preferred open source focused crawler for the deep web.
Venom is a focused crawler that collects targeted information from the deep web by navigating pages and extracting data from them. You supply seed URLs and rules describing the content to look for, and it returns the raw HTML pages along with the extracted data. Its main audience is software developers building applications that need targeted web data collection.
No commits in the last 6 months.
Use this if you are a developer needing to programmatically collect structured data from many specific web pages on the deep web.
Not ideal if you are a non-developer seeking a point-and-click solution for general web scraping or surface web browsing.
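The "seed URLs plus rules" workflow described above can be sketched in plain Java. This is a generic illustration of focused crawling, not venom's actual API: the class `FocusedCrawlSketch`, the `crawl` method, its `fetch` and `follow` parameters, and the `example.com` pages are all hypothetical, and the network is replaced by an in-memory map so the sketch runs offline.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of a focused crawl; it does not use or mirror venom's classes.
public class FocusedCrawlSketch {

    // Matches href="..." values. A real crawler would use an HTML parser instead.
    private static final Pattern HREF = Pattern.compile("href=\"([^\"]+)\"");

    /**
     * Breadth-first crawl starting from the seed URLs.
     * fetch  returns the raw HTML for a URL, or null if the page is unavailable.
     * follow is the rule deciding which discovered links are worth visiting.
     * Returns a map of URL to raw HTML, in visit order.
     */
    public static Map<String, String> crawl(List<String> seeds,
                                            Function<String, String> fetch,
                                            Predicate<String> follow,
                                            int maxPages) {
        Map<String, String> pages = new LinkedHashMap<>();
        Set<String> seen = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        for (String seed : seeds) {
            if (seen.add(seed)) {
                queue.add(seed);
            }
        }

        while (!queue.isEmpty() && pages.size() < maxPages) {
            String url = queue.poll();
            String html = fetch.apply(url);
            if (html == null) {
                continue; // skip pages that could not be fetched
            }
            pages.put(url, html);

            Matcher m = HREF.matcher(html);
            while (m.find()) {
                String link = m.group(1);
                // seen.add returns false for URLs that were already queued or visited
                if (follow.test(link) && seen.add(link)) {
                    queue.add(link);
                }
            }
        }
        return pages;
    }

    // In-memory stand-in for a small website (hypothetical pages).
    static Map<String, String> demoSite() {
        Map<String, String> site = new HashMap<>();
        site.put("https://example.com/",
                "<a href=\"https://example.com/items/1\">1</a>"
              + "<a href=\"https://example.com/about\">About</a>");
        site.put("https://example.com/items/1",
                "<a href=\"https://example.com/items/2\">next</a>"
              + "<a href=\"https://example.com/\">home</a>");
        site.put("https://example.com/items/2", "<p>last item</p>");
        site.put("https://example.com/about", "<p>about us</p>");
        return site;
    }

    public static void main(String[] args) {
        Map<String, String> site = demoSite();
        Map<String, String> pages = crawl(
                Arrays.asList("https://example.com/"),
                site::get,
                link -> link.contains("/items/"), // rule: only follow item pages
                10);
        pages.keySet().forEach(System.out::println);
    }
}
```

The `follow` predicate is what makes the crawl focused: links that fail the rule, such as the about page here, are never fetched, which keeps the crawl on the pages that carry the data of interest.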
Stars: 75
Forks: 5
Language: Java
License: Apache-2.0
Category:
Last pushed: Jun 14, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/perception/PreferredAI/venom"
Open to everyone: 100 requests/day without a key, or 1,000/day with a free key.
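The same request can be issued from Java with the standard `java.net.http.HttpClient` (Java 11 or later). This is a minimal sketch: the class and method names (`PerceptionApiSketch`, `buildUri`, `fetch`) are hypothetical, and because the response format is not described here, the body is returned as an unparsed string.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper wrapping the curl call shown above.
public class PerceptionApiSketch {

    // Base path taken from the curl example; owner and repo are appended to it.
    private static final String BASE =
            "https://pt-edge.onrender.com/api/v1/quality/perception/";

    static URI buildUri(String owner, String repo) {
        return URI.create(BASE + owner + "/" + repo);
    }

    // Sends a GET request and returns the raw response body.
    // Assumption: a successful call answers with HTTP 200.
    static String fetch(String owner, String repo)
            throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(buildUri(owner, repo))
                .timeout(Duration.ofSeconds(10))
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }
}
```

Calling `PerceptionApiSketch.fetch("PreferredAI", "venom")` would request the same URL as the curl command; with the keyless limit of 100 requests/day, callers that poll many repositories may want to cache results.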
Higher-rated alternatives
scrapy/scrapy
Scrapy, a fast high-level web crawling & scraping framework for Python.
Altimis/Scweet
A simple and unlimited Twitter scraper: scrape tweets, likes, retweets, following, followers,...
lexiforest/curl_cffi
Python binding for curl-impersonate fork via cffi. An HTTP client that can impersonate browser...
plabayo/rama
modular service framework to move and transform network packets
scrapinghub/spidermon
Scrapy Extension for monitoring spiders execution.