code4craft/webmagic

A scalable web crawler framework for Java.

Score: 57 / 100 (Established)

This framework helps developers quickly build custom web crawlers that collect specific data from websites. You define which pages to visit and what information to pull from them, and the framework provides the tools to download pages, manage the URL queue, extract content, and save the results. It is aimed at software developers who need to automate data collection from the web.
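The four stages described above (download, URL management, extraction, saving) can be sketched with nothing but the Java standard library. This is an illustration of the general crawl loop, not WebMagic's API: the class `CrawlSketch`, the method `crawl`, and the in-memory "site" used as a stand-in downloader are all hypothetical names chosen for this example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stdlib-only sketch; none of these names come from WebMagic.
class CrawlSketch {
    // Naive patterns, good enough for the toy HTML below (assumption: real pages need a proper parser).
    private static final Pattern LINK = Pattern.compile("href=\"([^\"]+)\"");
    private static final Pattern TITLE = Pattern.compile("<title>([^<]*)</title>");

    // Crawls breadth-first from `seed`. `downloader` maps a URL to its HTML, or null if unavailable.
    // Returns url -> extracted title, in the order pages were processed.
    static Map<String, String> crawl(String seed, Function<String, String> downloader, int maxPages) {
        Deque<String> queue = new ArrayDeque<>();      // URL management: pending URLs
        Set<String> seen = new HashSet<>();            // URL management: de-duplication
        Map<String, String> saved = new LinkedHashMap<>(); // "save" stage: an in-memory store
        queue.add(seed);
        seen.add(seed);

        while (!queue.isEmpty() && saved.size() < maxPages) {
            String url = queue.poll();
            String html = downloader.apply(url);       // download stage
            if (html == null) {
                continue;                              // skip pages that could not be fetched
            }
            Matcher title = TITLE.matcher(html);       // extraction stage
            saved.put(url, title.find() ? title.group(1).trim() : "");

            Matcher link = LINK.matcher(html);         // discover follow-up URLs
            while (link.find()) {
                String next = link.group(1);
                if (seen.add(next)) {                  // add() returns false for already-seen URLs
                    queue.add(next);
                }
            }
        }
        return saved;
    }

    public static void main(String[] args) {
        // A three-page in-memory "website" so the example runs without network access.
        Map<String, String> site = Map.of(
            "/a", "<title>A</title><a href=\"/b\">b</a><a href=\"/c\">c</a>",
            "/b", "<title>B</title><a href=\"/a\">back</a>",
            "/c", "<title>C</title>");
        System.out.println(crawl("/a", site::get, 10)); // prints {/a=A, /b=B, /c=C}
    }
}
```

Passing the downloader in as a function keeps the loop independent of how pages are fetched, which is roughly the separation of concerns the description attributes to the framework; consult the project's own documentation for its actual interfaces.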

11,699 stars.

Use this if you are a Java developer needing to build a custom, scalable web crawler to systematically extract specific data from multiple web pages.

Not ideal if you need a no-code solution for web scraping or if your project isn't built with Java.

Tags: web-scraping, data-acquisition, web-automation, java-development, data-collection
No Package, No Dependents
Maintenance 6 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 25 / 25

Stars: 11,699
Forks: 4,149
Language: Java
License: Apache-2.0
Category: scraper
Last pushed: Dec 20, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/perception/code4craft/webmagic"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
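
For Java projects, the same request can be made with the JDK's built-in `java.net.http.HttpClient` (Java 11+). The sketch below is a minimal example under assumptions: the class name `QualityApiSketch` and the helper `buildRequest` are hypothetical, the 10-second timeout is an arbitrary choice, and because the response format is not documented here, the body is printed as raw text rather than parsed.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper class; only the endpoint URL comes from the page above.
class QualityApiSketch {
    static final String BASE = "https://pt-edge.onrender.com/api/v1/quality/perception/";

    // Builds a GET request for one repository, e.g. ("code4craft", "webmagic").
    static HttpRequest buildRequest(String owner, String repo) {
        return HttpRequest.newBuilder(URI.create(BASE + owner + "/" + repo))
                .timeout(Duration.ofSeconds(10)) // assumption: arbitrary timeout for the sketch
                .GET()
                .build();
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response = client.send(
                buildRequest("code4craft", "webmagic"),
                HttpResponse.BodyHandlers.ofString());
        // The response schema is not specified here, so print status and raw body only.
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

Keeping request construction separate from sending makes it easy to stay within the unauthenticated daily limit while testing, since the URL can be checked without issuing a call.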