Wittline/data-engineering-challenge-th

Dockerizing a Python script for web scraping and serving the scraped data with FastAPI (www.metroscubicos.com)

Score: 32 / 100 (Emerging)

This project helps developers quickly set up an isolated environment that collects data from a website and exposes it through a web service. It takes a Python web-scraping script and combines it with a data store and an API, all within a single Docker container. A developer or data engineer would use it to deploy a scraper and serve its output with minimal setup.

No commits in the last 6 months.

Use this if you need a contained solution to scrape data from a website, store it in a local database, and then serve that data through an API endpoint.

Not ideal if you need a complex, scalable data pipeline or a solution for highly dynamic or anti-scraping protected websites.
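
The scrape, store, and serve flow described above can be sketched with the Python standard library alone. This is a minimal illustration, not the repository's code: the sample HTML, the `listings` table, and the function names are all hypothetical, and SQLite stands in for whatever local database the project actually uses. In the real project, a FastAPI route handler would return something like the result of `list_listings`.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical markup standing in for a scraped listings page.
SAMPLE_HTML = """
<ul>
  <li class="listing" data-price="1500000">Apartment in Roma Norte</li>
  <li class="listing" data-price="2300000">House in Coyoacan</li>
</ul>
"""


class ListingParser(HTMLParser):
    """Collect (title, price) pairs from <li class="listing"> elements."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._price = None  # set while inside a listing element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and attrs.get("class") == "listing":
            self._price = int(attrs["data-price"])

    def handle_data(self, data):
        # The first non-blank text inside a listing element is its title.
        if self._price is not None and data.strip():
            self.rows.append((data.strip(), self._price))
            self._price = None


def store(conn, rows):
    """Persist scraped rows in a local table (assumed schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS listings (title TEXT, price INTEGER)"
    )
    conn.executemany("INSERT INTO listings VALUES (?, ?)", rows)
    conn.commit()


def list_listings(conn, max_price=None):
    """Return stored rows as dicts, the shape an API endpoint could serve."""
    sql = "SELECT title, price FROM listings"
    params = ()
    if max_price is not None:
        sql += " WHERE price <= ?"
        params = (max_price,)
    sql += " ORDER BY price"
    return [{"title": t, "price": p} for t, p in conn.execute(sql, params)]


def run_pipeline(html):
    """Scrape the given HTML and load it into an in-memory database."""
    parser = ListingParser()
    parser.feed(html)
    conn = sqlite3.connect(":memory:")
    store(conn, parser.rows)
    return conn
```

Calling `run_pipeline(SAMPLE_HTML)` and then `list_listings(conn, max_price=2000000)` returns only the rows at or below that price. Keeping the query in a plain function makes it easy to test without starting a web server.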

web-scraping data-acquisition API-development dockerization data-serving
Stale 6m No Package No Dependents
Maintenance 0 / 25
Adoption 6 / 25
Maturity 16 / 25
Community 10 / 25


Stars: 15
Forks: 2
Language: Python
License: Apache-2.0
Category: scraper
Last pushed: Dec 16, 2021
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/perception/Wittline/data-engineering-challenge-th"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
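
The same request can be made from Python with the standard library. This is a rough sketch: the URL pattern is taken from the curl command above, while the helper names are hypothetical and the assumption that the endpoint returns JSON is not confirmed by this page.

```python
import json
import urllib.request

# Base path copied from the curl example; everything else here is illustrative.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/perception"


def quality_url(owner, repo):
    """Build the endpoint URL for a given GitHub owner/repository pair."""
    return f"{BASE_URL}/{owner}/{repo}"


def fetch_quality(owner, repo, timeout=10.0):
    """Fetch the quality record. Assumes the response body is JSON."""
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

For example, `fetch_quality("Wittline", "data-engineering-challenge-th")` performs the same GET request as the curl command. Unauthenticated calls count against the 100 requests/day limit, so cache the result if you poll it repeatedly.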