scrapehero/selectorlib

A library that reads a YAML file of XPath or CSS selectors and uses them to extract data from HTML pages.

Score: 51 / 100 (Established)

This tool helps developers automate the process of extracting specific information from web pages. You provide it with a web page's HTML content and a YAML file that defines what data you want to pull out (like titles or links) using CSS selectors or XPath. The output is structured data, such as a dictionary, containing the extracted information. This is ideal for developers building web scraping solutions or data collection tools.
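The workflow described above (field definitions in, structured dictionary out) can be illustrated with a small standard-library sketch. This is not selectorlib's API: the `FIELDS` dictionary, the `extract` function, and the sample page are hypothetical stand-ins. A plain Python dict replaces the YAML file, and `xml.etree.ElementTree` with its limited XPath subset replaces a full CSS/XPath engine, so it only handles well-formed markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical field definitions. In the tool described above these would
# live in a YAML file; a dict is used here to stay within the stdlib.
FIELDS = {
    "title": {"xpath": ".//h1", "type": "text"},
    "link": {"xpath": ".//h2/a", "type": "link"},
}


def extract(markup, fields):
    """Apply each field's selector to the markup and return a dict."""
    root = ET.fromstring(markup)  # requires well-formed markup
    result = {}
    for name, spec in fields.items():
        node = root.find(spec["xpath"])  # first match only
        if node is None:
            result[name] = None
        elif spec["type"] == "link":
            result[name] = node.get("href")
        else:
            result[name] = "".join(node.itertext()).strip()
    return result


# Hypothetical sample page used only for this illustration.
PAGE = """<html><body>
<h1>Title</h1>
<h2>Usage <a href="http://test">docs</a></h2>
</body></html>"""

data = extract(PAGE, FIELDS)
print(data)
```

Keeping the selectors in data rather than in code is the point of the design: the extraction rules can change without touching the program that applies them.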

Used by 1 other package. No commits in the last 6 months. Available on PyPI.

Use this if you are a developer who needs a structured and configurable way to define and extract data from HTML content within your Python applications.

Not ideal if you need a visual, no-code web scraping tool, or if you require advanced features such as CAPTCHA solving, JavaScript rendering, or proxy management.

web-scraping data-extraction developer-tool automation
Stale 6m
Maintenance 0 / 25
Adoption 10 / 25
Maturity 25 / 25
Community 16 / 25

Stars: 74
Forks: 12
Language: HTML
License: MIT
Category: scraper
Last pushed: Jan 30, 2023
Commits (30d): 0
Dependencies: 3
Reverse dependents: 1

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/perception/scrapehero/selectorlib"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.