KadekM/scrawler
Scala web crawling and scraping using fs2 streams
This project helps developers gather data from websites by defining how to navigate web pages and what information to extract. You provide a starting URL and rules for which links to follow and what data points (like text or URLs) to collect. It then outputs the extracted data, which can be further processed or stored.
No commits in the last 6 months.
Use this if you are a Scala developer building an application that needs to automatically collect specific content or follow links from various websites in a structured and efficient way.
Not ideal if you need a no-code solution or a tool with a graphical user interface for web scraping, as this is a library for Scala programmers.
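The workflow described above (start URL, link-following rules, data extraction) can be sketched in plain Scala. All names below (`CrawlSketch`, `crawl`, the in-memory `site` map) are illustrative assumptions, not scrawler's actual API; the real library fetches pages over HTTP and streams results with fs2, whereas this sketch crawls a simulated site with regex-based extraction:

```scala
// A minimal, self-contained sketch of the crawl-and-extract pattern this
// library automates. These names are illustrative, NOT scrawler's API.
object CrawlSketch {
  // Simulated web: page URL -> HTML body (stands in for real HTTP fetches)
  val site: Map[String, String] = Map(
    "/start" -> """<a href="/a">A</a><a href="/b">B</a><h1>Start</h1>""",
    "/a"     -> """<a href="/b">B</a><h1>Page A</h1>""",
    "/b"     -> """<h1>Page B</h1>"""
  )

  private val linkPattern  = """href="([^"]+)"""".r
  private val titlePattern = """<h1>([^<]+)</h1>""".r

  // Breadth-first crawl: follow links matching `follow`, extract <h1> text,
  // and track visited URLs so pages are processed at most once.
  def crawl(start: String, follow: String => Boolean): List[(String, String)] = {
    @annotation.tailrec
    def loop(queue: List[String], seen: Set[String],
             acc: List[(String, String)]): List[(String, String)] =
      queue match {
        case Nil                      => acc.reverse
        case url :: rest if seen(url) => loop(rest, seen, acc)
        case url :: rest =>
          site.get(url) match {
            case None => loop(rest, seen + url, acc)
            case Some(html) =>
              val links  = linkPattern.findAllMatchIn(html)
                             .map(_.group(1)).filter(follow).toList
              val titles = titlePattern.findAllMatchIn(html)
                             .map(m => url -> m.group(1)).toList
              loop(rest ++ links, seen + url, titles reverse_::: acc)
          }
      }
    loop(List(start), Set.empty, Nil)
  }

  def main(args: Array[String]): Unit =
    crawl("/start", _.startsWith("/")).foreach {
      case (url, title) => println(s"$url -> $title")
    }
}
```

Running this visits `/start`, `/a`, and `/b` once each and prints the extracted headings in crawl order. A streaming library like fs2 would express the same loop as a `Stream` so results can be consumed or stored as they arrive rather than collected into a list.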
Stars
16
Forks
3
Language
HTML
License
MIT
Category
Last pushed
Aug 29, 2017
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/perception/KadekM/scrawler"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
scrapy/scrapy
Scrapy, a fast high-level web crawling & scraping framework for Python.
Altimis/Scweet
A simple and unlimited Twitter scraper: scrape tweets, likes, retweets, following, followers,...
lexiforest/curl_cffi
Python binding for curl-impersonate fork via cffi. An HTTP client that can impersonate browser...
plabayo/rama
Modular service framework to move and transform network packets.
scrapinghub/spidermon
Scrapy Extension for monitoring spiders execution.