dagster-io/dagster
An orchestration platform for the development, production, and observation of data assets.
Data workflows get complex when they involve many interdependent steps, such as pulling data, cleaning it, running models, and generating reports. Dagster helps data engineers, ML engineers, and other data professionals define these steps as 'data assets,' connect them, and run them reliably. You write Python code that defines the assets; Dagster orchestrates their execution and shows you the lineage and status of all your data.
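The idea of defining steps as assets and running them in dependency order can be sketched in plain Python. This is a minimal illustration using only the standard library, not Dagster's actual API: the `asset` decorator, the `ASSETS` registry, `materialize_all`, and the sample order data are all hypothetical names invented for this example.

```python
from graphlib import TopologicalSorter

# Hypothetical registry: asset name -> (function, names of upstream assets).
ASSETS = {}


def asset(*deps):
    """Register a function as a named asset that depends on `deps`."""
    def register(fn):
        ASSETS[fn.__name__] = (fn, deps)
        return fn
    return register


@asset()
def raw_orders():
    # Stand-in for "pulling data"; the rows are made up for illustration.
    return [{"id": 1, "amount": 20.0}, {"id": 2, "amount": None}]


@asset("raw_orders")
def clean_orders(raw_orders):
    # Stand-in for "cleaning": drop rows with a missing amount.
    return [row for row in raw_orders if row["amount"] is not None]


@asset("clean_orders")
def revenue_report(clean_orders):
    # Stand-in for "generating reports".
    return {
        "orders": len(clean_orders),
        "revenue": sum(row["amount"] for row in clean_orders),
    }


def materialize_all():
    """Run every registered asset after its upstream assets."""
    graph = {name: set(deps) for name, (_, deps) in ASSETS.items()}
    order = list(TopologicalSorter(graph).static_order())
    results = {}
    for name in order:
        fn, deps = ASSETS[name]
        results[name] = fn(*(results[dep] for dep in deps))
    return order, results
```

Calling `materialize_all()` returns the execution order together with each asset's value. The execution order doubles as a simple lineage trace, which is the kind of information an orchestrator surfaces alongside run status.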
15,121 stars. Used by 2 other packages. Actively maintained with 243 commits in the last 30 days. Available on PyPI.
Use this if you need to build, run, and monitor complex data pipelines that produce structured data, machine learning models, or analytical reports.
Not ideal if your workflow involves only simple, one-off data tasks or if you're not working with Python.
Stars: 15,121
Forks: 2,028
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 18, 2026
Commits (30d): 243
Dependencies: 29
Reverse dependents: 2
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/data-engineering/dagster-io/dagster"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
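The same request can be made from Python with the standard library. This is a rough sketch under two assumptions that the page does not confirm: that the endpoint returns JSON, and that the three path segments mean category, owner, and repository. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL (segment meanings are an assumption)."""
    parts = (category, owner, repo)
    return BASE_URL + "/" + "/".join(quote(part, safe="") for part in parts)


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the record and decode it, assuming the body is JSON."""
    url = quality_url(category, owner, repo)
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("data-engineering", "dagster-io", "dagster")
```

With the keyless limit of 100 requests per day, it is worth caching the decoded result locally instead of refetching on every run.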
Related tools
PrefectHQ/prefect
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
growthbook/growthbook
Open Source Feature Flags, Experimentation, and Product Analytics
koopjs/koop
Transform, query, and download geospatial data on the web.
pathwaycom/pathway
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
dlt-hub/dlt
data load tool (dlt) is an open source Python library that makes data loading easy 🛠️