sodadata/soda-core
Data Contracts engine for the modern data stack. https://www.soda.io
Soda Core helps data teams verify the accuracy and reliability of their datasets. You define "data contracts" in human-readable YAML, specifying the expected schema and data quality rules for tables in your data warehouse. You then connect the tool to a data source (such as Snowflake or BigQuery), and it automatically checks whether the actual data meets the contract, alerting you to any discrepancies.
2,310 stars. Actively maintained with 31 commits in the last 30 days.
Use this if you need a systematic way to define and automatically validate the quality and structure of data moving through your data pipelines or residing in your data warehouses.
Not ideal if you are looking for a visual, no-code interface for data quality monitoring, as this tool primarily uses YAML configurations and a command-line interface.
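To make the contract idea above concrete, here is a minimal sketch in plain Python of what "verify data against a declared schema and rules" can look like. It is an illustration only: `ColumnRule`, `Contract`, and `verify` are hypothetical names invented for this example, and the structure is not Soda Core's YAML format or API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


# Hypothetical types for illustration; NOT Soda Core's contract schema.
@dataclass
class ColumnRule:
    name: str
    dtype: type
    required: bool = True  # when True, a None value counts as a violation


@dataclass
class Contract:
    dataset: str
    columns: List[ColumnRule] = field(default_factory=list)


def verify(contract: Contract, rows: List[Dict[str, Any]]) -> List[str]:
    """Return human-readable violations; an empty list means the rows pass."""
    problems: List[str] = []
    expected = {c.name for c in contract.columns}
    for i, row in enumerate(rows):
        # Schema check: every declared column must be present in the row.
        for name in sorted(expected - row.keys()):
            problems.append(f"row {i}: missing column '{name}'")
        # Quality checks: null handling and type conformance per column.
        for col in contract.columns:
            if col.name not in row:
                continue
            value = row[col.name]
            if value is None:
                if col.required:
                    problems.append(f"row {i}: '{col.name}' is null")
            elif not isinstance(value, col.dtype):
                problems.append(
                    f"row {i}: '{col.name}' expected {col.dtype.__name__}, "
                    f"got {type(value).__name__}"
                )
    return problems


contract = Contract("customers", [ColumnRule("id", int), ColumnRule("email", str)])
rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": "2", "email": None},
    {"email": "b@example.com"},
]
for problem in verify(contract, rows):
    print(problem)
```

In the real tool the rules live in a YAML file and the checks run against the warehouse rather than in-memory rows; the sketch only shows the declare-then-verify shape.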
Stars: 2,310
Forks: 259
Language: Python
License: —
Category:
Last pushed: Mar 18, 2026
Commits (30d): 31
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/data-engineering/sodadata/soda-core"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
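The same request can be made from Python with only the standard library. This is a sketch under assumptions: the helper names `quality_url` and `fetch_quality` are hypothetical, the `category/owner/repo` path pattern is inferred from the single curl example above, and the response format is not documented on this page, so the body is returned as raw text rather than parsed.

```python
import urllib.request

# Base path taken from the curl example; assumed to be shared by other repos.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL for a category and an 'owner/name' repo slug."""
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the raw response body as text (format not documented here)."""
    with urllib.request.urlopen(quality_url(category, repo), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch_quality("data-engineering", "sodadata/soda-core")` issues the same GET request as the curl command; each call counts against the daily request limit.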
Related tools
PrefectHQ/prefect
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
growthbook/growthbook
Open Source Feature Flags, Experimentation, and Product Analytics
koopjs/koop
Transform, query, and download geospatial data on the web.
pathwaycom/pathway
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
dagster-io/dagster
An orchestration platform for the development, production, and observation of data assets.