dagster-io/dagster

An orchestration platform for the development, production, and observation of data assets.

Quality score: 84 / 100 (Verified)

Managing data workflows gets complex when many interdependent steps are involved, such as pulling data, cleaning it, running models, and generating reports. Dagster lets data engineers, ML engineers, and other data professionals define these steps as 'data assets', declare the dependencies between them, and run them reliably. You write Python code that defines the assets; Dagster orchestrates their execution and shows the lineage and status of each one.

15,121 stars. Used by 2 other packages. Actively maintained with 243 commits in the last 30 days. Available on PyPI.

Use this if you need to build, run, and monitor complex data pipelines that produce structured data, machine learning models, or analytical reports.

Not ideal if your workflow involves only simple, one-off data tasks or if you're not working with Python.

data-engineering MLOps data-orchestration data-pipelines data-observability
Maintenance 25 / 25
Adoption 12 / 25
Maturity 25 / 25
Community 22 / 25


Stars: 15,121
Forks: 2,028
Language: Python
License: Apache-2.0
Last pushed: Mar 18, 2026
Commits (30d): 243
Dependencies: 29
Reverse dependents: 2

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/data-engineering/dagster-io/dagster"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
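The same request can be made from Python. The sketch below uses only the standard library; the helper names are hypothetical, the category/owner/repo path layout is inferred from the single curl example above, and it assumes the endpoint returns a JSON body, which this page does not state.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (path layout inferred from the curl example)."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record. Assumes the response body is JSON."""
    with urlopen(quality_url(category, owner, repo), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Unauthenticated calls count against the 100 requests/day limit.
    print(fetch_quality("data-engineering", "dagster-io", "dagster"))
```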