datazip-inc/olake

OLake - Fastest Databases, Kafka & S3 Replication to Apache Iceberg or Plain Parquet. ⚡ Efficient, quick, and scalable data ingestion for real-time analytics. Supported sources: Postgres, MongoDB, MySQL, Oracle, MSSQL, DB2, Kafka, S3.

Verified

OLake helps data engineers and analysts move large volumes of data quickly from databases such as PostgreSQL, MySQL, MongoDB, and Oracle, as well as from Amazon S3 and Kafka, into a unified data lakehouse built on Apache Iceberg or Parquet. It streamlines building real-time analytical pipelines by converting raw operational data into a format optimized for fast querying and analysis, without requiring complex infrastructure setups.

1,310 stars. Actively maintained with 42 commits in the last 30 days.

Use this if you need to rapidly ingest data from multiple transactional databases, event streams, or object storage into an Apache Iceberg or Parquet data lake for real-time analytics, with minimal infrastructure overhead.

Not ideal if your primary need is data transformation or orchestration of complex data workflows beyond simple replication.

data-lakehouse real-time-analytics data-ingestion database-replication data-engineering

No Package No Dependents

Maintenance 23 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 23 / 25

Stars 1,310

Forks 210

License Apache-2.0

Related tools

PrefectHQ/prefect

Prefect is a workflow orchestration framework for building resilient data pipelines in Python.

growthbook/growthbook

Open Source Feature Flags, Experimentation, and Product Analytics

koopjs/koop

Transform, query, and download geospatial data on the web.

pathwaycom/pathway

Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.

dagster-io/dagster

An orchestration platform for the development, production, and observation of data assets.
