starlake-ai/starlake
Declarative, text-based tool for data analysts and engineers to extract, load, transform, and orchestrate their data pipelines.
This tool helps data professionals define, extract, load, transform, and orchestrate their data pipelines using simple, declarative text files instead of complex code. You provide specifications for what data to move, how to clean it, and where it should go, and the system handles the technical implementation across various databases, data lakes, and orchestrators. It's designed for data analysts and engineers managing complex data workflows.
Use this if you are a data professional who wants to build, manage, and automate data pipelines with clear, concise configuration files instead of writing extensive custom code.
Not ideal if you prefer to write all your data processing logic directly in a general-purpose language or framework such as Python or Spark.
Stars: 190
Forks: 27
Language: Scala
License: Apache-2.0
Category:
Last pushed: Mar 18, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/data-engineering/starlake-ai/starlake"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
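The same request can be made from a script. The following is a minimal Python sketch, not an official client: it assumes the endpoint returns a JSON body (the response format is not described on this page) and that the path segments after `quality/` are a category, an owner, and a repository name, as the curl URL suggests. The helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, following the pattern of the curl example.

    Assumption: the three path segments are category, owner, and repo.
    """
    # Percent-encode each segment so unusual names cannot break the path.
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return "/".join([BASE_URL, *parts])


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository.

    Assumption (unverified): the response body is JSON.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("data-engineering", "starlake-ai", "starlake")` issues the same request as the curl command. With no key, requests count against the 100/day limit, so a script that polls many repositories may want to cache results.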
Related tools
PrefectHQ/prefect
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
growthbook/growthbook
Open Source Feature Flags, Experimentation, and Product Analytics
koopjs/koop
Transform, query, and download geospatial data on the web.
pathwaycom/pathway
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
dagster-io/dagster
An orchestration platform for the development, production, and observation of data assets.