mehd-io/pypi-duck-flow
End-to-end data engineering project to get insights from PyPI using Python, DuckDB, MotherDuck & Evidence
This project helps data professionals understand usage patterns for Python projects on PyPI. It processes raw PyPI download logs, cleans and transforms them into meaningful metrics, and then presents these insights in an interactive dashboard. Data engineers or analysts can use this to monitor their projects or analyze the Python package ecosystem.
234 stars.
Use this if you need to build an end-to-end pipeline to gather, process, and visualize data about PyPI package downloads, especially if you work with Python, SQL, and DuckDB.
Not ideal if you're looking for a simple, pre-built web service to query PyPI stats without setting up any data infrastructure.
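As a rough illustration of the "clean and transform raw logs into metrics" step described above, here is a minimal sketch in plain Python. It is not the project's code: the project itself uses DuckDB and SQL, and the field names (`timestamp`, `project`, `version`), the sample rows, and the helper functions below are all hypothetical stand-ins chosen for illustration.

```python
from collections import Counter
from datetime import datetime

# Hypothetical raw log rows; real PyPI download logs carry many more fields.
RAW_LOGS = [
    {"timestamp": "2024-01-01T10:15:00", "project": "duckdb", "version": "0.9.2"},
    {"timestamp": "2024-01-01T18:40:00", "project": "DuckDB", "version": "0.9.2"},
    {"timestamp": "2024-01-02T09:05:00", "project": "duckdb", "version": "0.9.1"},
    {"timestamp": "not-a-date", "project": "duckdb", "version": "0.9.2"},
    {"timestamp": "2024-01-02T11:30:00", "project": "", "version": "1.0"},
]


def clean(rows):
    """Drop rows with a bad timestamp or empty project; normalize project names."""
    for row in rows:
        project = row.get("project", "").strip().lower()
        if not project:
            continue  # no project name, nothing to attribute the download to
        try:
            day = datetime.fromisoformat(row["timestamp"]).date()
        except (KeyError, ValueError):
            continue  # missing or unparsable timestamp
        yield {"day": day.isoformat(), "project": project,
               "version": row.get("version", "")}


def daily_downloads(rows):
    """Count cleaned download rows per (day, project) pair."""
    return Counter((r["day"], r["project"]) for r in clean(rows))


if __name__ == "__main__":
    for (day, project), count in sorted(daily_downloads(RAW_LOGS).items()):
        print(day, project, count)
```

In a DuckDB-based pipeline the same aggregation would more naturally be a SQL `GROUP BY` over the log table; the Python version is only meant to make the shape of the transformation concrete.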
Stars: 234
Forks: 36
Language: TypeScript
License: not listed
Category:
Last pushed: Mar 16, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/data-engineering/mehd-io/pypi-duck-flow"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
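The same request can be sketched in Python using only the standard library. This is a minimal sketch under assumptions: the `category/owner/repo` path layout is inferred from the single URL shown above, the response is assumed to be JSON (the listing does not document the format), and the helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

# Base path taken from the curl example in the listing.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL; the path layout is inferred from one example."""
    parts = (category, owner, repo)
    return BASE_URL + "/" + "/".join(quote(p, safe="") for p in parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response, assuming the body is JSON."""
    req = Request(quality_url(category, owner, repo),
                  headers={"Accept": "application/json"})
    with urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("data-engineering", "mehd-io", "pypi-duck-flow")` would issue the request from the curl example. How an API key is passed is not stated in the listing, so the sketch covers only the keyless tier.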
Related tools
PrefectHQ/prefect
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
growthbook/growthbook
Open Source Feature Flags, Experimentation, and Product Analytics
koopjs/koop
Transform, query, and download geospatial data on the web.
pathwaycom/pathway
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
dagster-io/dagster
An orchestration platform for the development, production, and observation of data assets.