pourmand1376/PersianCrawler

Open source crawler for Persian websites.

Score: 37 / 100 (Emerging)

This tool helps researchers, data scientists, and language model developers gather large amounts of text content from Persian-language websites. You input the website you want to scrape, and it outputs structured text data suitable for analysis, training models, or building datasets. It's designed for anyone needing to collect substantial text from popular Persian news sites or Wikipedia.

No commits in the last 6 months.

Use this if you need to build custom datasets of Persian text for natural language processing, sentiment analysis, or academic research.

Not ideal if you only need a small amount of data, if the websites you want to scrape are not in the provided list, or if you need highly customized scraping logic beyond simple parameter changes.

Persian-language-research, text-data-collection, NLP-dataset-creation, web-scraping, content-acquisition
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 6 / 25
Maturity 16 / 25
Community 15 / 25

Stars: 20
Forks: 4
Language: Python
License: MIT
Last pushed: Aug 27, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/pourmand1376/PersianCrawler"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
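The same request can be made from a script. The following is a minimal Python sketch using only the standard library. It rests on assumptions not stated on this page: that the path follows the pattern `quality/<category>/<owner>/<repo>` (inferred from the single curl URL above) and that the response body is JSON. The function names `build_quality_url` and `fetch_quality` are hypothetical helpers for illustration, not part of any published client.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL in the same shape as the curl example.

    Assumption: the path is <category>/<owner>/<repo>; this is inferred
    from one example URL, not from API documentation.
    """
    path = "/".join(quote(part, safe="") for part in (category, owner, repo))
    return BASE_URL + "/" + path


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record without an API key.

    Assumption: the response is JSON. The schema is not documented here,
    so the parsed object is returned unmodified.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = resp.read().decode("utf-8")
    return json.loads(body)


if __name__ == "__main__":
    # Only prints the URL; call fetch_quality(...) to perform the request.
    print(build_quality_url("nlp", "pourmand1376", "PersianCrawler"))
```

Because keyless access is limited to 100 requests/day, it may be worth caching the result locally rather than calling `fetch_quality` repeatedly for the same repository.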