crwlrsoft/robots-txt

Robots Exclusion Standard/Protocol Parser for Web Crawling/Scraping

Score: 40 / 100 (Emerging)

When building a web crawler or scraper, this package helps you interpret a website's robots.txt file. You feed it the file's rules and your crawler's user-agent identifier, and it tells you which parts of the site your crawler is allowed to visit. Web scraping developers and data engineers use it to make sure their crawlers respect website access policies.

Use this if you are programming a web crawler and need to determine automatically whether your crawler is permitted to access specific web pages or directories.

Not ideal if you are manually checking robots.txt files or need a tool for general web browsing, as this is specifically for automated crawler programs.
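
For a sense of how that looks in code, here is a minimal sketch in PHP. It assumes the RobotsTxt::parse() factory and the isAllowed($uri, $userAgent) check shown in the package's documentation; the robots.txt content and the MyCrawler user-agent token are made-up examples, so verify the exact API against the README.

<?php

require 'vendor/autoload.php';

use Crwlr\RobotsTxt\RobotsTxt;

// Made-up robots.txt content; in practice you would fetch
// https://example.com/robots.txt before crawling the site.
$robotsTxtContent = "User-agent: *\nDisallow: /admin\n\nUser-agent: MyCrawler\nDisallow: /private\n";

// Parse the rules once, then query them for every URL before visiting it.
$robotsTxt = RobotsTxt::parse($robotsTxtContent);

// Check paths against the rule group matching our crawler's user-agent token.
var_dump($robotsTxt->isAllowed('/private', 'MyCrawler')); // expected: bool(false)
var_dump($robotsTxt->isAllowed('/blog', 'MyCrawler'));    // expected: bool(true)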

web-scraping web-crawling data-extraction bot-development web-automation
No package · No dependents
Maintenance: 6 / 25
Adoption: 5 / 25
Maturity: 16 / 25
Community: 13 / 25


Stars: 10
Forks: 2
Language: PHP
License: MIT
Category: scraper
Last pushed: Jan 06, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/perception/crwlrsoft/robots-txt"

Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000 requests/day.
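
If you'd rather pull that record from PHP than from the shell, here is a minimal sketch. It assumes the endpoint returns a JSON body; the response schema is not documented here, so inspect the output to see the available fields.

<?php

// Same endpoint as the curl example above; no key needed under the free limit.
$url = 'https://pt-edge.onrender.com/api/v1/quality/perception/crwlrsoft/robots-txt';

$json = file_get_contents($url);
if ($json === false) {
    exit("Request failed\n");
}

// Decode to an associative array and dump it.
$data = json_decode($json, true);
print_r($data);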