AndreaBozzo/Ceres
Harvesting & Semantic search for open data portals
This tool helps government agencies, researchers, and data journalists collect, organize, and track publicly available datasets from multiple open data portals. It takes raw metadata from portals that run CKAN or publish DCAT metadata, normalizes it for consistency, and stores it in a local catalog. The result is a clean, searchable database of datasets that stays up to date, so you can find relevant information without visiting each source separately.
Use this if you need a reliable way to continuously gather and manage metadata from multiple open data portals and want to easily search or export that information.
Not ideal if you only need to download a few static datasets occasionally or are looking for a tool to host your own open data portal.
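The harvest, normalize, and store flow described above can be pictured with a small sketch. This is not Ceres's actual code: `RawRecord`, `CatalogEntry`, `collapse_ws`, `normalize`, and `upsert` are hypothetical names chosen for illustration, and an in-memory `HashMap` stands in for the local catalog.

```rust
use std::collections::HashMap;

/// Hypothetical raw record as it might arrive from a portal.
/// Field names are assumptions for illustration, not Ceres types.
#[derive(Debug, Clone)]
struct RawRecord {
    portal: String,
    id: String,
    title: String,
    description: Option<String>,
}

/// Hypothetical normalized entry stored in the local catalog.
#[derive(Debug, Clone, PartialEq)]
struct CatalogEntry {
    portal: String,
    id: String,
    title: String,
    description: String,
}

/// Trim the string and collapse internal runs of whitespace to single spaces.
fn collapse_ws(s: &str) -> String {
    s.split_whitespace().collect::<Vec<_>>().join(" ")
}

/// Bring a raw record into a consistent shape:
/// lowercase portal name, trimmed id, tidy title, empty string for a missing description.
fn normalize(raw: &RawRecord) -> CatalogEntry {
    CatalogEntry {
        portal: raw.portal.trim().to_lowercase(),
        id: raw.id.trim().to_string(),
        title: collapse_ws(&raw.title),
        description: raw.description.as_deref().map(collapse_ws).unwrap_or_default(),
    }
}

/// Insert or update an entry keyed by (portal, id).
/// Returns true only when the catalog actually changed, so re-harvesting
/// unchanged metadata is a no-op.
fn upsert(catalog: &mut HashMap<(String, String), CatalogEntry>, entry: CatalogEntry) -> bool {
    let key = (entry.portal.clone(), entry.id.clone());
    if catalog.get(&key) == Some(&entry) {
        return false;
    }
    catalog.insert(key, entry);
    true
}
```

In this sketch, harvesting the same record twice changes the catalog only the first time, which is one simple way a catalog can stay current without accumulating duplicates. A real harvester would also need HTTP access to the portals and persistent storage, both omitted here.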
Stars
12
Forks
3
Language
Rust
License
Apache-2.0
Category
Last pushed
Mar 21, 2026
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/data-engineering/AndreaBozzo/Ceres"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
PrefectHQ/prefect
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
growthbook/growthbook
Open Source Feature Flags, Experimentation, and Product Analytics
koopjs/koop
Transform, query, and download geospatial data on the web.
pathwaycom/pathway
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
dagster-io/dagster
An orchestration platform for the development, production, and observation of data assets.