lezwon/CatalystOps

Semantic cost-linting and performance warnings extension for Databricks in VS Code

Experimental

This tool helps data engineers and data scientists who write PySpark code for Databricks catch performance and cost issues early. It takes your PySpark notebook code as input and gives immediate feedback on potential problems, such as inefficient data operations or schema mismatches, so you can write faster, cheaper Spark jobs.

Use this if you are a data engineer or data scientist developing PySpark applications on Databricks and want to prevent common performance bottlenecks and control cloud spend before deploying your code.

Not ideal if you do not use PySpark or Databricks, or if you prefer to debug performance issues only after jobs have run in production.

data-engineering pyspark-optimization databricks-cost-management etl-pipeline-performance streaming-data-quality

No Package, No Dependents

Maintenance 13 / 25

Adoption 5 / 25

Maturity 11 / 25

Community 0 / 25

Stars

Forks

—

Language: TypeScript

License

—

Higher-rated alternatives

PrefectHQ/prefect

Prefect is a workflow orchestration framework for building resilient data pipelines in Python.

growthbook/growthbook

Open Source Feature Flags, Experimentation, and Product Analytics

koopjs/koop

Transform, query, and download geospatial data on the web.

pathwaycom/pathway

Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.

dagster-io/dagster

An orchestration platform for the development, production, and observation of data assets.
