pourmand1376/PersianCrawler
Open source crawler for Persian websites.
This tool helps researchers, data scientists, and language model developers gather large amounts of text from Persian-language websites. You specify a supported website to scrape, and it outputs structured text data suitable for analysis, model training, or dataset building. It targets anyone who needs to collect substantial text from popular Persian news sites or Wikipedia.
No commits in the last 6 months.
Use this if you need to build custom datasets of Persian text for natural language processing, sentiment analysis, or academic research.
Not ideal if you only need a small amount of data, if the websites you want to scrape are not in the provided list, or if you need customized scraping logic beyond simple parameter changes.
Stars: 20
Forks: 4
Language: Python
License: MIT
Category
Last pushed: Aug 27, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/pourmand1376/PersianCrawler"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
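The same request can be made from Python. The snippet below is a minimal sketch using only the standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical, reading the `nlp` path segment as a category is an assumption based on the URL shape, and the response is assumed to be JSON, since its format is not documented here.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Assemble the endpoint URL in the same shape as the curl example.

    Assumption: the path is <category>/<owner>/<repo>; only the single
    URL shown in the listing is known for certain.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository.

    Assumption: the endpoint returns a JSON body. Its schema is not
    documented in this listing, so the parsed value is returned as-is.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to perform the request.
    print(build_quality_url("nlp", "pourmand1376", "PersianCrawler"))
```

Unauthenticated calls count against the 100 requests/day limit, so caching the result locally is advisable when polling several repositories.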