wszqkzqk/qt-web-extractor

Web content extraction engine backed by Qt WebEngine.

29 / 100 (Experimental)

This tool is for people who need up-to-date information from modern websites, especially those built with JavaScript frameworks, or who need to extract content from PDF documents. It takes a web page URL or a PDF document as input and returns clean, readable Markdown or HTML, suitable for feeding into AI models or other data-processing workflows. It is useful for web research, content aggregation, and building AI agents that interact with web content.

Use this if you need to reliably extract content from dynamic web pages that use JavaScript or require login, or if you need to pull text directly from PDF files for further analysis or AI processing.

Not ideal if you only need to fetch static HTML content without JavaScript rendering, or if you require full browser automation features like clicking buttons or filling forms.

web-scraping content-extraction AI-data-preparation market-intelligence research-automation
No Package, No Dependents
Maintenance 13 / 25
Adoption 5 / 25
Maturity 11 / 25
Community 0 / 25

Stars: 11
Forks:
Language: Python
License: GPL-3.0
Last pushed: Mar 27, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/perception/wszqkzqk/qt-web-extractor"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
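
The same request can be made from Python using only the standard library. This is a minimal sketch, not an official client: the helper names build_quality_url and fetch_quality are hypothetical, and it assumes the endpoint returns a JSON body, which this page does not confirm.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality/perception"


def build_quality_url(owner: str, repo: str) -> str:
    """Build the quality endpoint URL for an owner/repo pair (hypothetical helper)."""
    # Percent-encode each path segment so unusual characters cannot break the URL.
    return f"{API_BASE}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality data for a repository (hypothetical helper).

    Assumption: the response body is JSON. If it is not, json.loads
    raises ValueError and the caller should inspect the raw body instead.
    """
    url = build_quality_url(owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read().decode("utf-8")
    return json.loads(body)
```

Calling fetch_quality("wszqkzqk", "qt-web-extractor") issues the same GET request as the curl command. Unauthenticated requests count against the 100 requests/day limit, so caching the result locally is a reasonable choice for repeated lookups.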