Web Scraping Tools ML Frameworks
Tools and frameworks for automatically extracting data from websites through web scraping, crawling, and HTML parsing. Does NOT include data cleaning libraries, NLP analysis tools, or downstream ML applications that use scraped data.
39 web scraping tools and frameworks are tracked. Two score above 50 (the Established tier). The highest-rated is alirezamika/autoscraper at 57/100 with 7,122 stars.
Get all 39 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=ml-frameworks&subcategory=web-scraping-tools&limit=20"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
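The same request can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper names `build_url` and `fetch_projects` are hypothetical, the response is assumed to be JSON (as the heading above suggests) but its schema is not documented here, and `limit=20` is copied from the curl command as-is. Whether that value returns all 39 projects is not confirmed by this page.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="ml-frameworks", subcategory="web-scraping-tools", limit=20):
    """Build the request URL with the same query parameters as the curl command.

    The defaults mirror the curl example; other values are untested assumptions.
    """
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """Fetch the URL and decode the body as JSON.

    The structure of the decoded payload is an assumption; inspect it before
    relying on specific keys.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Example usage (performs a network request, so it is left commented out):
# data = fetch_projects(build_url())
# print(type(data))
```

Each call counts against the daily request quota mentioned above, so caching the response locally is a reasonable choice when experimenting.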
| # | Framework | Score | Tier |
|---|---|---|---|
| 1 | alirezamika/autoscraper<br>A Smart, Automatic, Fast and Lightweight Web Scraper for Python | | Established |
| 2 | YoongiKim/AutoCrawler<br>Google, Naver multiprocess image web crawler (Selenium) | | Established |
| 3 | machine-learning-apps/Issue-Label-Bot<br>Code For The Issue Label Bot, an App that automatically labels issues using... | | Emerging |
| 4 | nuhmanpk/Webtrench<br>A powerful and easy-to-use web scrapper for collecting data from the web.... | | Emerging |
| 5 | lorey/mlscraper<br>🤖 Scrape data from HTML websites automatically by just providing examples | | Emerging |
| 6 | shaohua0116/ICLR2020-OpenReviewData<br>Script that crawls meta data from ICLR OpenReview webpage. Tutorials on... | | Emerging |
| 7 | tal95shah/OLX_Scraper<br>📻 An OLX Scraper using Scrapy + MongoDB. It Scrapes recent ads posted... | | Emerging |
| 8 | gridaco/figma-archives<br>Figma Files Scraper for Research & Studies | | Emerging |
| 9 | garysieling/video-crawler<br>Crawl websites for videos from Youtube, Vimeo, Soundcloud, etc | | Emerging |
| 10 | Tuhin-thinks/instagram-unfollower-tracker-meerkit<br>Analyze Instagram followers, find unfollowers, automate follow/unfollow, and... | | Emerging |
| 11 | NYX-VORAX/lightning-image-scraper<br>⚡ Lightning-fast Python image scraper \| Download 10K+ images/min from any... | | Emerging |
| 12 | DevGlitch/botwizer<br>Social media AI bot using computer vision to imitate human behaviors. Final... | | Emerging |
| 13 | ganeshkavhar/Web-Scraping-in-python<br>ganesh kavhar python project | | Emerging |
| 14 | udit-git/Python-WebScraper<br>A Smart, Automatic, Fast and Lightweight Web Scraper for Python | | Experimental |
| 15 | Tsujimar/tsuki-wscp<br>Web scraper for AI/ML training | | Experimental |
| 16 | b1t0nese/MacLearn<br>A program that will assemble a high-quality dataset for you in a matter of minutes... | | Experimental |
| 17 | dpuentel/github-issues-labeller-cohere<br>This is a GitHub issue labeller. Insert the url of a repository and using... | | Experimental |
| 18 | Eshtiaque/Multi-Agent-Instagram-bot<br>Instagram-bot | | Experimental |
| 19 | OwenOrcan/YiraBot-Crawler<br>YiraBot: Simplifying Web Scraping for All. A user-friendly tool for... | | Experimental |
| 20 | ismailazdad/stackoverflowTags<br>flask website that automatically assigns multiple relevant tags to a... | | Experimental |
| 21 | zt8812/lightning-image-scraper<br>🖼️ Download thousands of images fast with asynchronous scraping and... | | Experimental |
| 22 | YafetGetu/Data_scraper-from-jiji-ethiopia<br>A professional web scraping tool for extracting product listings from the... | | Experimental |
| 23 | jigusp/urls-le<br>🔗 Extract thousands of URLs per second from various formats like HTML, JSON,... | | Experimental |
| 24 | ArtificialOSS/WebCrawl<br>Crawls the web to generate a huge dataset for training | | Experimental |
| 25 | BlazeInferno64/ScrapyPy<br>ScrapyPy is a free, open-source, and powerful web scraping tool that... | | Experimental |
| 26 | MaximumOverflow/Philia<br>An easy to use imageboard scraper. | | Experimental |
| 27 | Decodo/soundcloud-scraper<br>Scraper for SoundCloud that extracts audio metadata and download URLs using... | | Experimental |
| 28 | Gulilil/nusava<br>Development of Social media bot in Instagram, Nusava. | | Experimental |
| 29 | gabryelvieiramusico/instagram-content-intelligence-pro<br>📊 Transform Instagram content into actionable insights with AI-driven... | | Experimental |
| 30 | gmk418/Python-web-scraping<br>🔍 Discover Python web scraping techniques, libraries, and examples to... | | Experimental |
| 31 | bhavanaaroy-sketch/AI-Code-Complexity-Analyzer<br>AI-based tool to analyze code complexity using Python and Streamlit | | Experimental |
| 32 | Mrsultan7890/crl<br>CRL Pure Python crawler The Semantic Web Crawler For AI & Security | | Experimental |
| 33 | bright-data-de/web-scraping-for-machine-learning<br>Scrape web data for machine learning, set up ETL pipelines... | | Experimental |
| 34 | ozgesadet/silver-invention<br>AI based tender finding | | Experimental |
| 35 | Epsilon-Ventures/document-similarity-frontend<br>Major Project Frontend | | Experimental |
| 36 | mate3424/easy-zoot-data-scraper<br>🛍️ Scrape structured fashion product data effortlessly from multiple... | | Experimental |
| 37 | basilcherian42/Insta-Insights<br>Insta-Insights: A powerful tool to identify fake accounts on Instagram and... | | Experimental |
| 38 | Billie-LS/Scraping-ML-Deep<br>Scraping web for ML and Deep Learning applications | | Experimental |
| 39 | memosasoft/wiki-nerd-1.0<br>Scraper for Wikipedia self learning project. I was interested on how we read... | | Experimental |