lorey/mlscraper

🤖 Scrape data from HTML websites automatically by just providing examples

44 / 100

Emerging

This tool helps you automatically extract specific pieces of information from websites, like product details, author names, or dates. You show it a few examples of the data you want to capture from an HTML page, and it learns how to find similar information on other pages. This is ideal for anyone who needs to collect structured data from many web pages without manually writing complex extraction rules.
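The idea of learning an extraction rule from an example can be illustrated with a minimal standard-library sketch. This is not mlscraper's API or its actual algorithm; the names `learn_path` and `extract` are hypothetical, and the "rule" here is simply the chain of tags and `class` attributes leading to the example value, which is far cruder than what a real tool would infer.

```python
from html.parser import HTMLParser


class PathCollector(HTMLParser):
    """Collects (path, text) pairs; a path is a tuple of (tag, class) ancestors."""

    # Void elements never get a closing tag, so they are kept off the stack.
    VOID = {"br", "hr", "img", "input", "link", "meta"}

    def __init__(self):
        super().__init__()
        self.stack = []
        self.texts = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.VOID:
            self.stack.append((tag, dict(attrs).get("class")))

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag, tolerating sloppy HTML.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.texts.append((tuple(self.stack), text))


def collect(html):
    parser = PathCollector()
    parser.feed(html)
    parser.close()
    return parser.texts


def learn_path(html, example_value):
    """Hypothetical helper: find the path of the first node whose text equals the example."""
    for path, text in collect(html):
        if text == example_value:
            return path
    raise ValueError(f"example value {example_value!r} not found in page")


def extract(html, path):
    """Hypothetical helper: return all text nodes that sit at the learned path."""
    return [text for p, text in collect(html) if p == path]


# Train on one page by giving an example value, then apply the rule to another page.
training_page = (
    '<html><body><div class="post"><h1>Hello</h1>'
    '<span class="author">Ada Lovelace</span></div></body></html>'
)
new_page = (
    '<html><body><div class="post"><h1>Bye</h1>'
    '<span class="author">Alan Turing</span></div></body></html>'
)
author_path = learn_path(training_page, "Ada Lovelace")
authors = extract(new_page, author_path)
```

A rule learned from a single example like this breaks as soon as page layouts differ, which is why example-driven scrapers typically ask for several samples and generalize across them.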

1,379 stars. No commits in the last 6 months. Available on PyPI.

Use this if you need to gather specific, structured data from multiple web pages and want to avoid the tedious process of defining exact locations or selectors for each piece of information.

Not ideal if you only need to extract data from a single web page or if your data extraction needs are highly dynamic and require real-time, complex decision-making beyond pattern recognition.

web-scraping data-extraction market-research content-aggregation competitive-analysis

No License Stale 6m

Maintenance 0 / 25

Adoption 10 / 25

Maturity 17 / 25

Community 17 / 25


Stars

1,379

Forks

Language

Python

License

—

Higher-rated alternatives

alirezamika/autoscraper

A Smart, Automatic, Fast and Lightweight Web Scraper for Python

YoongiKim/AutoCrawler

Google, Naver multiprocess image web crawler (Selenium)

machine-learning-apps/Issue-Label-Bot

Code For The Issue Label Bot, an App that automatically labels issues using machine learning,...

nuhmanpk/Webtrench

A powerful and easy-to-use web scraper for collecting data from the web. Supports scraping of...

shaohua0116/ICLR2020-OpenReviewData

Script that crawls meta data from ICLR OpenReview webpage. Tutorials on installing and using...
