datazip-inc/olake
OLake - Fastest Databases, Kafka & S3 Replication to Apache Iceberg or Plain Parquet. ⚡ Efficient, quick and scalable data ingestion for real-time analytics. Supported sources: Postgres, MongoDB, MySQL, Oracle, MSSql, DB2, Kafka, S3.
OLake helps data engineers and analysts move large volumes of data quickly from databases such as PostgreSQL, MySQL, MongoDB, and Oracle, as well as from Amazon S3 and Kafka, into a unified data lakehouse built on Apache Iceberg or Parquet. It replicates raw operational data into a format optimized for fast querying and analysis, which streamlines building real-time analytical pipelines without requiring complex infrastructure setup.
1,310 stars. Actively maintained with 42 commits in the last 30 days.
Use this if you need to rapidly ingest data from multiple transactional databases, event streams, or object storage into an Apache Iceberg or Parquet data lake for real-time analytics, with minimal infrastructure overhead.
Not ideal if your primary need is data transformation or orchestration of complex data workflows beyond simple replication.
Stars: 1,310
Forks: 210
Language: Go
License: Apache-2.0
Category:
Last pushed: Mar 19, 2026
Commits (30d): 42
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/data-engineering/datazip-inc/olake"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
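The same request can be made from a script. The following is a minimal sketch using only the Python standard library. It assumes the response body is JSON and that the path follows a category/owner/repo pattern, inferred from the single URL above; neither is confirmed by this page. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and no API key is sent because the key mechanism is not described here.

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Assemble the endpoint URL.

    Assumption: the category/owner/repo layout is inferred from the one
    URL shown on this page and may not hold for every listing.
    """
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """GET the endpoint and parse the body.

    Assumption: the response is JSON; its schema is not documented here,
    so the parsed value is returned as-is.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Performs one network request, which counts against the daily quota.
    data = fetch_quality("data-engineering", "datazip-inc", "olake")
    print(json.dumps(data, indent=2))
```

Because the keyless tier allows 100 requests per day, a script that polls this endpoint would likely want to cache results locally instead of refetching on every run.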
Related tools
PrefectHQ/prefect
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
growthbook/growthbook
Open Source Feature Flags, Experimentation, and Product Analytics
koopjs/koop
Transform, query, and download geospatial data on the web.
pathwaycom/pathway
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
dagster-io/dagster
An orchestration platform for the development, production, and observation of data assets.