Wittline/data-engineering-challenge-th

Dockerizing a Python script for web scraping and consuming the scraped data using FastAPI (www.metroscubicos.com)

/ 100

Emerging

This project helps developers quickly set up an isolated environment that collects data from a website and exposes it through a web service. It combines a Python web-scraping script with a data store and an API, all within a single Docker container. A developer or data engineer would use it to deploy a scraper and serve the scraped data from one place.

No commits in the last 6 months.

Use this if you need a contained solution to scrape data from a website, store it in a local database, and then serve that data through an API endpoint.

Not ideal if you need a complex, scalable data pipeline or a solution for highly dynamic or anti-scraping protected websites.
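The scrape, store, and serve flow described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the repository's actual code: the `listings` table, the `store_listings` and `get_listings` helpers, and the sample records are all hypothetical, and SQLite stands in for whatever local database the project uses. In the project itself, a FastAPI route would play the role of `get_listings`.

```python
import json
import sqlite3

# Hypothetical records standing in for what a scraper would extract from HTML.
SAMPLE_LISTINGS = [
    {"title": "Listing A", "price": 3500000.0},
    {"title": "Listing B", "price": 7200000.0},
]


def store_listings(conn, listings):
    """Persist scraped records in a local table (schema is an assumption)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS listings ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "title TEXT NOT NULL, "
        "price REAL NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO listings (title, price) VALUES (:title, :price)",
        listings,
    )
    conn.commit()


def get_listings(conn, max_price=None):
    """Return stored records as JSON-ready dicts, as an API handler would."""
    query = "SELECT id, title, price FROM listings"
    params = ()
    if max_price is not None:
        query += " WHERE price <= ?"
        params = (max_price,)
    query += " ORDER BY id"
    rows = conn.execute(query, params).fetchall()
    return [{"id": r[0], "title": r[1], "price": r[2]} for r in rows]


# An in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
store_listings(conn, SAMPLE_LISTINGS)
payload = json.dumps(get_listings(conn, max_price=5000000))
print(payload)
```

Keeping the query logic in a plain function like this makes it easy to test without starting a web server; the API layer then only has to call it and return the result.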

web-scraping data-acquisition API-development dockerization data-serving

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 6 / 25

Maturity 16 / 25

Community 10 / 25


Language: Python

License: Apache-2.0

Featured in

Giving AI Agents Eyes: Browser Automation in 2026

Higher-rated alternatives

scrapy/scrapy

Scrapy, a fast high-level web crawling & scraping framework for Python.

Altimis/Scweet

A simple and unlimited Twitter scraper: scrape tweets, likes, retweets, following, followers,...

lexiforest/curl_cffi

Python binding for curl-impersonate fork via cffi. An HTTP client that can impersonate browser...

plabayo/rama

modular service framework to move and transform network packets

scrapinghub/spidermon

Scrapy Extension for monitoring spiders execution.
