lorey/mlscraper
🤖 Scrape data from HTML websites automatically by just providing examples
This tool helps you automatically extract specific pieces of information from websites, like product details, author names, or dates. You show it a few examples of the data you want to capture from an HTML page, and it learns how to find similar information on other pages. This is ideal for anyone who needs to collect structured data from many web pages without manually writing complex extraction rules.
1,379 stars. No commits in the last 6 months. Available on PyPI.
Use this if you need to gather specific, structured data from multiple web pages and want to avoid the tedious process of defining exact locations or selectors for each piece of information.
Not ideal if you only need to extract data from a single web page or if your data extraction needs are highly dynamic and require real-time, complex decision-making beyond pattern recognition.
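The "learn from examples" idea described above can be illustrated with a small standard-library sketch. This is not mlscraper's actual algorithm or API. It is a simplified stand-in that assumes a field can be located by the exact chain of tags and classes leading to its text. The sample pages, field names, and the helpers `collect`, `train`, and `scrape` are all hypothetical.

```python
from html.parser import HTMLParser

# Elements that never get a closing tag, so they must not stay on the stack.
VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}


class PathCollector(HTMLParser):
    """Record every non-empty text node together with its tag/class path."""

    def __init__(self):
        super().__init__()
        self.stack = []  # currently open elements, e.g. ["div", "h3.name"]
        self.texts = []  # (path, text) pairs in document order

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        cls = dict(attrs).get("class")
        self.stack.append(f"{tag}.{cls}" if cls else tag)

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag, tolerating sloppy HTML.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i].split(".")[0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.texts.append((tuple(self.stack), text))


def collect(html):
    parser = PathCollector()
    parser.feed(html)
    parser.close()
    return parser.texts


def train(html, sample):
    """Learn one path per field from a page and the values expected on it."""
    texts = collect(html)
    rules = {}
    for field, value in sample.items():
        matches = [path for path, text in texts if text == value]
        if not matches:
            raise ValueError(f"example value for {field!r} not found on page")
        rules[field] = matches[0]
    return rules


def scrape(html, rules):
    """Apply learned paths to a new page; missing fields come back as None."""
    texts = collect(html)
    return {
        field: next((text for path, text in texts if path == wanted), None)
        for field, wanted in rules.items()
    }


# Hypothetical pages that share one layout (made-up data).
PAGE_A = '<div><h3 class="name">Jane Doe</h3><span class="born">May 1, 1970</span></div>'
PAGE_B = '<div><h3 class="name">John Roe</h3><span class="born">June 2, 1980</span></div>'

rules = train(PAGE_A, {"name": "Jane Doe", "born": "May 1, 1970"})
result = scrape(PAGE_B, rules)
print(result)
```

A single example page is enough here only because both pages share an identical layout. Real pages vary, which is why supplying several examples, as the description suggests, helps a learned rule generalize.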
Stars: 1,379
Forks: 93
Language: Python
License: —
Category:
Last pushed: Mar 17, 2024
Commits (30d): 0
Dependencies: 6
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/lorey/mlscraper"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
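The same request can be made from Python with only the standard library. The sketch below assumes the URL pattern generalizes as category/owner/repository, which is inferred from the single URL shown above and is not confirmed. The response format is not documented here, so the body is returned undecoded. `build_request` and `fetch_raw` are illustrative names.

```python
import urllib.request

BASE = "https://pt-edge.onrender.com/api/v1/quality"


def build_request(category, owner, repo):
    """Build the GET request for one repository's record (assumed URL pattern)."""
    return urllib.request.Request(f"{BASE}/{category}/{owner}/{repo}")


def fetch_raw(request, timeout=10):
    """Perform the request and return the undecoded response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


request = build_request("ml-frameworks", "lorey", "mlscraper")
print(request.full_url)
# body = fetch_raw(request)  # uncomment to call the API; counts against the daily limit
```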
Higher-rated alternatives
alirezamika/autoscraper
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
YoongiKim/AutoCrawler
Google, Naver multiprocess image web crawler (Selenium)
machine-learning-apps/Issue-Label-Bot
Code For The Issue Label Bot, an App that automatically labels issues using machine learning,...
nuhmanpk/Webtrench
A powerful and easy-to-use web scrapper for collecting data from the web. Supports scraping of...
shaohua0116/ICLR2020-OpenReviewData
Script that crawls meta data from ICLR OpenReview webpage. Tutorials on installing and using...