NoxelS/openai-scraper
This is a template repository for building a web scraper with OpenAI support. The repository provides a basic project structure with TypeScript and Puppeteer pre-configured, as well as OpenAI's GPT-3 API integration. With this template, you can easily build a scraper that uses machine learning to analyze and extract insights from the scraped data.
This project helps developers build robust web scrapers. It takes a website URL as input and extracts data, which can then be stored in a MySQL database. It's used by software developers or data engineers who need to programmatically collect information from websites for various applications.
No commits in the last 6 months.
Use this if you are a developer looking for a comprehensive template to create a web scraper that includes scheduling, advanced browser control, and database integration.
Not ideal if you are a non-developer seeking a no-code solution for simple data extraction, as this requires coding knowledge to set up and customize.
Stars: 27
Forks: 2
Language: TypeScript
License: none listed
Category:
Last pushed: Jan 29, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/NoxelS/openai-scraper"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
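The same request can be made from TypeScript. The following is a minimal sketch, not an official client: it assumes a runtime with a global `fetch` (for example Node 18+), assumes the endpoint returns JSON, and assumes other repositories follow the same URL pattern as the curl example. The names `API_BASE`, `buildQualityUrl`, and `fetchQuality` are hypothetical and chosen only for illustration.

```typescript
// Base URL copied from the curl example above.
const API_BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools";

// Build the endpoint URL for a repository slug such as "NoxelS/openai-scraper".
// Assumption: every listed repository uses the same path layout.
function buildQualityUrl(slug: string): string {
  const parts = slug.split("/");
  if (parts.length !== 2 || !parts[0] || !parts[1]) {
    throw new Error(`Expected "owner/repo", got "${slug}"`);
  }
  const [owner, repo] = parts;
  return `${API_BASE}/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}`;
}

// Fetch the listing data for one repository.
// The response schema is not documented on this page, so the parsed body
// is returned as `unknown` instead of a guessed type.
async function fetchQuality(slug: string): Promise<unknown> {
  const response = await fetch(buildQualityUrl(slug));
  if (!response.ok) {
    throw new Error(`Request failed: ${response.status} ${response.statusText}`);
  }
  return response.json();
}
```

Calling `fetchQuality("NoxelS/openai-scraper")` should issue the same GET request as the curl command. Because the anonymous tier is limited to 100 requests/day, callers may want to cache results rather than refetch on every run.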
Higher-rated alternatives
jamesturk/scrapeghost
👻 Experimental library for scraping websites using OpenAI's GPT API.
Priyanshu-hawk/ChatGPT-unofficial-api-selenium
An unofficial ChatGPT API that uses Selenium, intended for prompt testing and flow testing.
3281448091/easyChatGPT
An unofficial yet elegant interface of the ChatGPT API using browser automation that bypasses...
ryuseisan/auto-chatgpt
Automate interaction with the browser version of ChatGPT.
Ryaang/gpt-web-crawler
A web crawler for GPTs to build knowledge bases.