All Data Engineering Tools
1,297 tools ranked by quality score · Page 5 of 13
| # | Tool | Score | Tier |
|---|---|---|---|
| 401 | turbot/steampipe-plugin-ldap: Use SQL to instantly query users, groups, OUs and more from LDAP. Open... | | Emerging |
| 402 | AbstractionsLab/satrap-dl: SATRAP-DL (Semi-Automated Threat Reconnaissance and Analysis Powered by... | | Emerging |
| 403 | Edwardvaneechoud/pyfloe: A minimal zero-dependency dataframe library | | Emerging |
| 404 | ivszhuravlev/spark-tuning-handbook: Hands-on Spark internals and performance engineering. | | Emerging |
| 405 | Oatapza/libredb-studio: 🛠️ Build and manage SQL databases effortlessly with LibreDB Studio, the... | | Emerging |
| 406 | jrlasak/awesome-databricks: 170+ curated resources every Databricks Data Engineer should bookmark -... | | Emerging |
| 407 | OpenAF/oafp: Command-line tool that takes an input, usually a data structure (e.g. JSON),... | | Emerging |
| 408 | synmetrix/synmetrix: Synmetrix - production-ready open source semantic layer on Cube | | Emerging |
| 409 | markusbegerow/data-analytics-exercises: End-to-end data warehouse exercises for students - build a modern ELT... | | Emerging |
| 410 | neokd/DataStorehouse: DataStoreHouse is an open-source project that aims to create a collaborative... | | Emerging |
| 411 | turbot/steampipe-plugin-twitter: Use SQL to instantly query tweets, users and followers from Twitter. Open... | | Emerging |
| 412 | hasna/connectors: Open source API connectors | | Emerging |
| 413 | VincenzoImp/job-search-tool: Automated job search and analysis tool powered by JobSpy. Features... | | Emerging |
| 414 | crackcell/hpipe: Workflow engine for various computing systems. | | Emerging |
| 415 | mahmoudparsian/data-warehousing: This repository is a place for the Data Warehousing course at the... | | Emerging |
| 416 | turbot/steampipe-plugin-googlesheets: Use SQL to instantly query spreadsheets, sheets, and cell data from Google... | | Emerging |
| 417 | mindsdb/dbt-mindsdb: dbt adapter for connecting to MindsDB | | Emerging |
| 418 | Canner/vulcan-sql: Data API Framework for AI Agents and Data Apps | | Emerging |
| 419 | fairtracks/omnipy: Omnipy is a high level Python library for type-driven data wrangling and... | | Emerging |
| 420 | wtbates99/tabletalk: tabletalk is a declarative language for seamless interaction with your... | | Emerging |
| 421 | govtech-data-practice/vowl: A validation engine for Open Data Contract Standard (ODCS) data contracts.... | | Emerging |
| 422 | DataKitchen/dataops-observability-agents: DataOps Observability Integration Agents are part of DataKitchen's Open... | | Emerging |
| 423 | nshkrdotcom/flowstone: Asset-first data orchestration for Elixir/BEAM. Dagster-inspired with OTP... | | Emerging |
| 424 | Vetdatahub/VetDataHub: VetDataHub is an open-source veterinary datasets repository dedicated to... | | Emerging |
| 425 | sevapru/terrorblade: A unified data extraction and parsing platform for messaging platforms. It... | | Emerging |
| 426 | Amber-Williams/hackernews-whos-hiring: Real-time SQL database from Hacker News "hiring" thread | | Emerging |
| 427 | tenzir/library: Packages for the Tenzir ecosystem. | | Emerging |
| 428 | nightmarewalker/D-MemFS: In-process virtual filesystem with hard quota for Python | | Emerging |
| 429 | mlr-org/mlr3db: Data backends to let mlr3 work transparently with (remote) databases | | Emerging |
| 430 | AbdullahEmad22/realtime-data-engineering-project: An end-to-end data engineering pipeline that orchestrates data ingestion,... | | Emerging |
| 431 | AlvaroCavalcante/airflow-parse-bench: Stop creating bad DAGs! Use this tool to measure and compare the parse time... | | Emerging |
| 432 | continuous-dems/fetchez: Fetchez is a lightweight, modular, and highly extendable Python framework... | | Emerging |
| 433 | prefeitura-rio/pipelines_rj_smtr: Code for capturing and processing SMTR data | | Emerging |
| 434 | exasol/exasol-personal: The High-Performance Analytics Engine - Free for Personal Use | | Emerging |
| 435 | turbot/steampipe-plugin-virustotal: Use SQL to instantly query file, domain, URL and IP scanning results from VirusTotal. | | Emerging |
| 436 | stitchfix/hamilton: A scalable general purpose micro-framework for defining dataflows. THIS... | | Emerging |
| 437 | onlozanoo/databroom: Databroom is a cross-language data cleaning tool with CLI, GUI, and API.... | | Emerging |
| 438 | kasztp/dbx-exam-guide: Databricks Certifications - Exam prep guide | | Emerging |
| 439 | root-11/tablite: Multiprocessing-enabled, out-of-memory data analysis library for tabular data. | | Emerging |
| 440 | gopidesupavan/qualink: Data quality validation, profiling, anomaly detection, and YAML-driven... | | Emerging |
| 441 | ottogroup/koality: Library for data quality monitoring based on DuckDB. | | Emerging |
| 442 | BEKO2210/World_report: A self-updating global dashboard that aggregates 40+ open data sources... | | Emerging |
| 443 | SunnyX6/Datapillar: Raw In, Golden Wings Out | | Emerging |
| 444 | rush-db/rushdb: RushDB is an Instant Database for Modern Apps & AI. Built on top of Neo4j. | | Emerging |
| 445 | drake69/spendify: 🏦 Personal finance ledger - aggregates bank statements (CSV/XLSX) into a... | | Emerging |
| 446 | BigData-Ananlysiser/UGC-Analysiser: An open-source full-stack big data project, mainly covering real-time data collection, machine learning, big data processing, and front-end visualization | | Emerging |
| 447 | aasouzaconsult/portfolio-dados: Repository of data analysis projects (seeking value in data!!!) | | Emerging |
| 448 | Thyznol/firefly-iii-Pico-Data-Importer: The Firefly III Data Importer can import data into Firefly, Automatically... | | Emerging |
| 449 | chnm/bom: Website files, database GUI, and data pipeline scripts for the London Bills... | | Emerging |
| 450 | bogwi/sarpro: Blazing-fast Sentinel-1 Synthetic Aperture Radar (SAR) GRD to GeoTIFF/JPEG... | | Emerging |
| 451 | kameshsampath/postgis-snowflake-intelligence-demo: This demo showcases a production-ready architecture for managing smart city... | | Emerging |
| 452 | vedanthv/data-engineering-portfolio: Cool DE Projects | | Emerging |
| 453 | mbari-org/aidata: (ETL) Extract, transform, load/download and augment images and annotations... | | Emerging |
| 454 | polarbase-team/polarbase: Extensible Open-source Data Backend for PostgreSQL. Features a multi-view UI... | | Emerging |
| 455 | jtakish/airflow-provider-sap-hana: Airflow provider package for SAP HANA | | Emerging |
| 456 | atolcd/sdis-remocra: 🔥 Remocra - Open-source business platform designed by and for the SDIS. | | Emerging |
| 457 | polakowo/datadocs: Documentation for data enthusiasts | | Emerging |
| 458 | kevin-hanselman/dud: A lightweight CLI tool for versioning data alongside source code and... | | Emerging |
| 459 | savantly-net/nexus-command: FOSS ERP - data management, automation, and integration for any business.... | | Emerging |
| 460 | empowerai/fs-middlelayer-api: US Forest Service ePermit API | | Emerging |
| 461 | SwellDB/SwellDB: The data system that answers anything. | | Emerging |
| 462 | bitroot/coflux: Open-source workflow engine. Orchestrate and observe computational workflows... | | Emerging |
| 463 | sergio11/covid_tweets_etl_architecture: 📚🧪 This is a learning-focused POC that explores a microservices ETL... | | Emerging |
| 464 | SentryPeer/SentryPeerHQ: Fraud Detection for VoIP. Use SentryPeer® HQ to help prevent VoIP... | | Emerging |
| 465 | viadee/camunda-kafka-polling-client: Stream your process history to Kafka | | Emerging |
| 466 | tshu-w/DBCopilot: Code and data for the paper "DBCᴏᴘɪʟᴏᴛ: Natural Language Querying over... | | Emerging |
| 467 | astronomer/cosmos-ebook-companion: Companion repository to the Practical Guide: Orchestrating dbt with Apache... | | Emerging |
| 468 | cderickson/Mox-Data.com: Mox-Data.com is a cloud-based data ingestion tool used to process raw data... | | Emerging |
| 469 | pkochanowicz/n8n-setup-docker: Fast, safe and smart setup for self-hosted n8n placed in a Docker container,... | | Emerging |
| 470 | Bread-Technologies/Bread-Dataset-Viewer: VS Code extension to easily view and handle large datasets. Look at... | | Emerging |
| 471 | ErcinDedeoglu/Postalized: The ultimate address parsing tool. Effortlessly parse and expand postal data... | | Emerging |
| 472 | Zipstack/visitran: Modern, AI-native and agentic Pythonic data transformation platform. | | Emerging |
| 473 | bruin-data/setup-bruin: Official action to install Bruin CLI in GitHub Actions. | | Emerging |
| 474 | GSA/coe-hud-acquisitions: A repository that contains links and information for acquisitions and... | | Emerging |
| 475 | provero-org/provero: Declarative data quality engine. Define checks in YAML, run anywhere. | | Emerging |
| 476 | jamie-steele/dockpipe: Run, isolate, and act - pipe commands into disposable containers and process... | | Emerging |
| 477 | richban/opendata-stack-platform: Open Data Stack Platform: a collection of projects and pipelines built with... | | Emerging |
| 478 | peter115342/soccer-tracker-DE-project: End-To-End Data Engineering Project. Made to learn some common data... | | Emerging |
| 479 | equitusai/arcxa: Mapping intelligence for enterprise data migrations: schema mapping,... | | Emerging |
| 480 | paulnamalomba/datashadric: datashadric provides a collection of well-organized modules for common data... | | Emerging |
| 481 | ramiradwan/onlyfans-conversational-analytics: Provides a unified view of conversation and analysis data to help you... | | Emerging |
| 482 | Bigdata-com/bigdata-briefs: Generate briefs based on financially relevant information from Bigdata.com | | Emerging |
| 483 | apache/seatunnel-tools: SeaTunnel is a multimodal, high-performance, distributed, massive data... | | Emerging |
| 484 | dubbl-org/dubbl: A full-featured, open-source alternative to Xero and QuickBooks. It is... | | Emerging |
| 485 | sicara/sicarator: Instant Setup & Best Quality for Data Projects! | | Emerging |
| 486 | Smart-Shaped/chaM3Leon: By Smart Shaped s.r.l. (https://www.smartshaped.com/) | | Emerging |
| 487 | TJAdryan/astro_blog: This site uses the amazing Astro.build project. I added **Google Docs** ... | | Emerging |
| 488 | The-Pulse-Engine/Pulse-Engine_Market_Intelligence_Platform: An explainable market analysis system that combines technical indicators and... | | Emerging |
| 489 | jroakes/SEODP: The SEO Data Platform automates SEO analysis, aggregating data from Google... | | Emerging |
| 490 | altamsh04/deafso-backend: A scalable backend for DeafSo (Capstone) | | Emerging |
| 491 | MTSWebServices/horizon: Simple HWM Store backend | | Emerging |
| 492 | PHACDataHub/data-mesh-ref-impl: Data Mesh Reference Implementation with standalone example use cases | | Emerging |
| 493 | turbot/steampipe-plugin-digitalocean: Use SQL to instantly query droplets, VPCs, users and more from DigitalOcean.... | | Emerging |
| 494 | turbot/steampipe-plugin-openapi: Use SQL to instantly query resources from OpenAPI. Open source CLI. No DB required. | | Emerging |
| 495 | PkLavc/PkLavc.github.io: PkLavc Portfolio \| Solutions & Integration Architect (Technical Owner).... | | Emerging |
| 496 | turbot/steampipe-plugin-imap: Use SQL to instantly query mailboxes, messages and more using IMAP. Open... | | Emerging |
| 497 | benzsevern/goldencheck: Data validation that discovers rules from your data. 19 MCP tools on... | | Emerging |
| 498 | limhaneul12/kafka-gov: Open-Source Apache Kafka Governance Platform | | Emerging |
| 499 | Codex-Crusader/le_Market_Intelligence_Platform: An explainable market analysis system that combines technical indicators and... | | Emerging |
| 500 | justvinhhere/bigquery-expert: Claude Code plugin that makes Claude a BigQuery expert. 5 skills covering... | | Emerging |