Codex56799/dataengineering
🚀 Build a containerized data engineering workflow for NYC Yellow Taxi Trip Data using Apache Spark, Airflow, MinIO, and DuckDB, fully reproducible with Docker.
Stars: 1
Forks: n/a
Language: Jupyter Notebook
License: MIT
Category:
Last pushed: Mar 19, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/data-engineering/Codex56799/dataengineering"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
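The same request can be made from Python, which is convenient inside a notebook. The sketch below uses only the standard library and rests on a few assumptions. The page does not document what the three path segments mean, so reading them as category, owner, and repository is a guess based on the curl example. The helper names `build_quality_url` and `fetch_quality` and the `User-Agent` value are hypothetical. The response format is not documented here either, so the body is returned as raw text rather than parsed.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Assemble the endpoint URL in the same shape as the curl example.

    Assumption: the three path segments are category, owner, and repository.
    """
    # Percent-encode each segment so unusual characters cannot break the path.
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0) -> str:
    """Send a keyless GET request and return the raw response body as text.

    The body is not parsed because the response format is not documented here.
    """
    request = Request(
        build_quality_url(category, owner, repo),
        headers={"User-Agent": "example-client"},  # hypothetical client name
    )
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Only builds and prints the URL; call fetch_quality(...) to hit the network.
    print(build_quality_url("data-engineering", "Codex56799", "dataengineering"))
```

Calling `fetch_quality("data-engineering", "Codex56799", "dataengineering")` sends the same request as the curl command and counts against the daily request limit.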
Higher-rated alternatives
elyra-ai/pipeline-editor
Common pipeline-editor components used in different clients (e.g. Elyra application, Web browser...
orchest/orchest
Build data pipelines, the easy way 🛠️
stitchfix/hamilton
A scalable general purpose micro-framework for defining dataflows. THIS REPOSITORY HAS BEEN...
chayansraj/Python-ETL-pipeline-using-Airflow-on-AWS
This project demonstrates how to build and automate an ETL pipeline written in Python and...
msmenegol/datapark
Datapark: a self-hosted data platform