AbsaOSS/pramen
Resilient data pipeline framework running on Apache Spark
Pramen helps data engineers and data scientists build robust data pipelines and data lakes. It ingests raw data from sources such as databases or files, transforms it, and delivers it to target systems, while handling common problems such as late-arriving data and schema changes. Data professionals use it to automate and manage complex data flows, from ingestion through to analytics or preparing data for machine learning models.
Use this if you need to build and manage large-scale, resilient pipelines for tabular data in a Spark ecosystem, with automated ingestion, transformation, and delivery plus built-in recovery and orchestration.
Not ideal if you primarily need real-time stream processing or if you do not run a Hadoop/Spark environment.
Stars: 26
Forks: 3
Language: Scala
License: Apache-2.0
Category:
Last pushed: Mar 18, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/data-engineering/AbsaOSS/pramen"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
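The same request can be made from a script. The following is a minimal Python sketch, not an official client: the URL pattern is inferred from the single curl command above, the function names `build_quality_url` and `fetch_quality` are hypothetical, and the response is assumed (not confirmed by this page) to be JSON. It makes a keyless request only, since the page does not say how a key is passed.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL as <base>/<category>/<owner>/<repo>.

    The pattern is an assumption generalized from one example URL.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return "/".join([BASE_URL, *parts])


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record without an API key.

    Assumes the body is JSON; json.loads raises ValueError if it is not.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the URL; call fetch_quality(...) to perform the request.
    print(build_quality_url("data-engineering", "AbsaOSS", "pramen"))
```

Calling `fetch_quality("data-engineering", "AbsaOSS", "pramen")` performs the request itself. Each call counts against the daily request limit, so caching the result locally is worth considering when polling several repositories.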
Higher-rated alternatives
PrefectHQ/prefect
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
growthbook/growthbook
Open Source Feature Flags, Experimentation, and Product Analytics
koopjs/koop
Transform, query, and download geospatial data on the web.
pathwaycom/pathway
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
dagster-io/dagster
An orchestration platform for the development, production, and observation of data assets.