moj-analytical-services/etl_manager
A Python package to create a database on the platform using our MoJ data warehousing framework.
This tool helps data engineers and analysts define the structure of their datasets stored in Amazon S3, making them easily queryable with SQL through Amazon Athena. You provide descriptions of your data files (like CSVs or Parquet files) and their columns, and it sets up the necessary metadata. The output is a defined data catalog in AWS Glue, allowing for straightforward SQL querying of your S3 data.
Use this if you need to create and manage schemas for your analytical datasets in AWS S3 and make them accessible for SQL queries using Amazon Athena, without manually configuring AWS Glue.
Not ideal if you do not use AWS S3 and Athena for your data storage and querying, or if you need robust data validation and conflict checking for column properties like patterns and enums.
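To make the "describe your files and columns, get a Glue catalog entry" idea concrete, here is a minimal sketch of the kind of metadata involved. It does not use etl_manager's own API: `build_table_input`, `FORMATS`, the bucket name, and the column names are all hypothetical, chosen only for illustration. The dictionary it produces follows the shape of the `TableInput` argument of the AWS Glue CreateTable API.

```python
# Hive/Athena storage classes for two common file formats.
FORMATS = {
    "csv": {
        "input": "org.apache.hadoop.mapred.TextInputFormat",
        "output": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
        "serde": "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe",
        "serde_params": {"field.delim": ","},
    },
    "parquet": {
        "input": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
        "output": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat",
        "serde": "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe",
        "serde_params": {},
    },
}


def build_table_input(name, location, data_format, columns, description=""):
    """Hypothetical helper: turn a table description into a Glue-style TableInput dict.

    `columns` is a list of (name, athena_type, comment) tuples.
    """
    fmt = FORMATS[data_format]
    return {
        "Name": name,
        "Description": description,
        "TableType": "EXTERNAL_TABLE",
        "Parameters": {"classification": data_format},
        "StorageDescriptor": {
            "Columns": [
                {"Name": col, "Type": typ, "Comment": comment}
                for col, typ, comment in columns
            ],
            "Location": location,
            "InputFormat": fmt["input"],
            "OutputFormat": fmt["output"],
            "SerdeInfo": {
                "SerializationLibrary": fmt["serde"],
                "Parameters": fmt["serde_params"],
            },
        },
    }


# Example description of a CSV dataset (bucket and columns are made up).
table_input = build_table_input(
    name="employees",
    location="s3://example-bucket/employees/",
    data_format="csv",
    columns=[
        ("employee_id", "int", "Unique employee identifier"),
        ("start_date", "date", "Date the employee joined"),
    ],
    description="Example table for illustration only",
)
```

A dictionary like this could be passed to boto3's `glue.create_table(DatabaseName=..., TableInput=table_input)` to register the table, after which Athena can query the files at that S3 location. A package such as etl_manager exists to spare you from assembling this structure by hand.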
Stars: 21
Forks: 11
Language: Python
License: —
Category:
Last pushed: Mar 16, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/data-engineering/moj-analytical-services/etl_manager"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
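The same request can be made from Python. The sketch below assumes that the path segments after `/quality/` are category, owner, and repository (inferred from the single URL above) and that the endpoint returns JSON; neither is confirmed here. The function names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL; the segment order is an assumption based on the curl example."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the record and parse it as JSON (assumes a JSON response body)."""
    with urllib.request.urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


url = quality_url("data-engineering", "moj-analytical-services", "etl_manager")
```

Calling `fetch_quality("data-engineering", "moj-analytical-services", "etl_manager")` performs the network request and counts against the daily request limit.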
Higher-rated alternatives
PrefectHQ/prefect
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
growthbook/growthbook
Open Source Feature Flags, Experimentation, and Product Analytics
koopjs/koop
Transform, query, and download geospatial data on the web.
pathwaycom/pathway
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
dagster-io/dagster
An orchestration platform for the development, production, and observation of data assets.