Tsujimar/tsuki-wscp

Web scraper for AI/ML training


Experimental

This tool helps AI/ML practitioners gather large text datasets from social media platforms such as 4chan, Reddit, and Twitter. You specify the sources you want, and it extracts posts or messages and stores them directly in your PostgreSQL database. It is aimed at data scientists, machine learning engineers, and researchers who need large volumes of social media text to train models.
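As a rough illustration of the "extract posts, then store them in a database" flow described above, here is a minimal sketch. It is not the project's actual code: the `Post` record, the `posts` table, and the `create_table` and `store_posts` helpers are all hypothetical names chosen for this example. To stay self-contained it runs against an in-memory SQLite database from the standard library as a stand-in for PostgreSQL. The same SQL should also work on PostgreSQL 9.5 or later through a DB-API driver such as psycopg2, with the placeholder set to `%s`.

```python
import sqlite3
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Post:
    """Hypothetical record shape; the real tool's schema may differ."""
    platform: str
    post_id: str
    body: str


# Assumed table layout: one row per (platform, post_id) pair.
CREATE_SQL = """
CREATE TABLE IF NOT EXISTS posts (
    platform TEXT NOT NULL,
    post_id  TEXT NOT NULL,
    body     TEXT NOT NULL,
    PRIMARY KEY (platform, post_id)
)
"""


def create_table(conn) -> None:
    """Create the posts table if it does not exist yet."""
    cur = conn.cursor()
    cur.execute(CREATE_SQL)
    conn.commit()


def store_posts(conn, posts: Iterable[Post], placeholder: str = "?") -> None:
    """Insert posts, silently skipping ones that were already stored.

    `placeholder` is "?" for sqlite3 and "%s" for psycopg2.
    """
    sql = (
        "INSERT INTO posts (platform, post_id, body) "
        f"VALUES ({placeholder}, {placeholder}, {placeholder}) "
        "ON CONFLICT (platform, post_id) DO NOTHING"
    )
    cur = conn.cursor()
    cur.executemany(sql, [(p.platform, p.post_id, p.body) for p in posts])
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # stand-in for a PostgreSQL connection
    create_table(conn)
    store_posts(conn, [
        Post("reddit", "abc123", "first post"),
        Post("reddit", "abc123", "duplicate id, skipped"),
        Post("twitter", "987", "a tweet"),
    ])
    for row in conn.execute("SELECT platform, post_id, body FROM posts"):
        print(row)
```

Keying rows on the platform and post ID means a scrape can be re-run without storing the same post twice, which matters when collecting in bulk.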

No commits in the last 6 months.

Use this if you need to rapidly collect high volumes of social media text data from specific platforms to train your AI or machine learning models.

Not ideal if you need to scrape data from websites other than the supported social media platforms, or if you prefer a tool with a graphical interface.

AI-dataset-collection, social-media-intelligence, machine-learning-data, natural-language-processing, data-acquisition

Stale (6 months), no package, no dependents

Maintenance 0 / 25

Adoption 7 / 25

Maturity 16 / 25

Community 6 / 25


Language: Python

License: MIT

Higher-rated alternatives

alirezamika/autoscraper

A Smart, Automatic, Fast and Lightweight Web Scraper for Python

YoongiKim/AutoCrawler

Google, Naver multiprocess image web crawler (Selenium)

machine-learning-apps/Issue-Label-Bot

Code For The Issue Label Bot, an App that automatically labels issues using machine learning,...

nuhmanpk/Webtrench

A powerful and easy-to-use web scraper for collecting data from the web. Supports scraping of...

lorey/mlscraper

🤖 Scrape data from HTML websites automatically by just providing examples
