bacalhau-project/bacalhau

Community-driven, simple, yet powerful framework for fast, cost-effective distributed Compute over Data.

Score: 59 / 100 (Established)

This framework helps data scientists, machine learning engineers, and operations teams process extremely large datasets without needing to move them. You provide your data and the computations you want to run, and it orchestrates the execution directly where the data resides. This eliminates costly data transfers and speeds up processing for tasks like log analysis or distributed model training.

Use this if you need to perform computations on massive datasets distributed across different locations and want to minimize data movement and network egress costs.

Not ideal if your data is small, centralized, and easy to move to a single processing unit, or if you require real-time, ultra-low-latency processing for interactive applications.

distributed-data-processing edge-computing big-data-analytics machine-learning-operations cloud-cost-optimization
No Package, No Dependents
Maintenance 13 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 20 / 25
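
The four category scores listed above add up to the overall score (13 + 10 + 16 + 20 = 59). The short sketch below checks that arithmetic. It assumes the total is a plain sum of the four categories, which matches the numbers shown here but is not a documented formula; the names `category_scores` and `overall_score` are illustrative only.

```python
# Category scores as listed above, each out of 25.
category_scores = {
    "Maintenance": 13,
    "Adoption": 10,
    "Maturity": 16,
    "Community": 20,
}


def overall_score(scores):
    """Sum the category scores (assumed, not documented, to form the total)."""
    return sum(scores.values())


total = overall_score(category_scores)
maximum = 25 * len(category_scores)
print(f"{total} / {maximum}")  # prints "59 / 100"
```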

Stars: 853
Forks: 101
Language: Go
License: Apache-2.0
Last pushed: Mar 28, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/data-engineering/bacalhau-project/bacalhau"

Open to everyone: 100 requests/day with no key needed, or 1,000/day with a free key.
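
For scripted use, the same request as the curl command above can be made from Python's standard library. This is a minimal sketch under two assumptions that the page does not confirm: that the URL follows a `category/owner/repo` pattern for other projects, and that the response body is JSON. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path taken from the curl example; everything after it is assumed
# to follow a <category>/<owner>/<repo> pattern.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL in the shape used by the curl example."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the endpoint and decode the body, assuming it is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("data-engineering", "bacalhau-project", "bacalhau")
# print(json.dumps(data, indent=2))
```

With no key, each call counts against the 100 requests/day limit, so caching the result locally is sensible for repeated runs.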