osdg-ai/osdg-data
The OSDG Community Dataset (OSDG-CD) is a public dataset of thousands of text excerpts, validated by OSDG Community Platform (OSDG-CP) citizen scientists with respect to the Sustainable Development Goals (SDGs). The dataset is updated every quarter and published on Zenodo.
The OSDG Community Dataset is a collection of text excerpts that volunteers have labeled according to the UN Sustainable Development Goals (SDGs). Updated quarterly, it gives researchers and data analysts pre-labeled text for studying how content relates to specific SDGs and for analyzing documents in the context of global sustainability efforts.
No commits in the last 6 months.
Use this if you need a pre-classified dataset of text excerpts related to the UN Sustainable Development Goals for research, model training, or detailed analysis.
Not ideal if you are looking for a tool to automatically classify your own documents into SDGs, as this project provides the dataset, not the classification system itself.
Stars: 38
Forks: 9
Language: (none listed)
License: GPL-3.0
Category: (none listed)
Last pushed: Oct 02, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/osdg-ai/osdg-data"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
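The curl request above can also be issued from a script. The following is a minimal Python sketch using only the standard library. It assumes the endpoint returns a JSON body, which the page does not state, and the names `build_quality_url` and `fetch_quality` are hypothetical helpers introduced here for illustration. The URL pattern is inferred from the single curl example shown, so other category or repository values may not resolve.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build an endpoint URL following the pattern of the curl example.

    The category/owner/repo layout is inferred from that one example
    and is an assumption, not documented API behavior.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return API_BASE + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository and parse it as JSON.

    Assumes a JSON response; json.loads raises ValueError otherwise.
    Each call counts against the keyless limit of 100 requests/day.
    """
    url = build_quality_url(category, owner, repo)
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("ml-frameworks", "osdg-ai", "osdg-data")` would request the same URL as the curl command. Because the response schema is not documented here, inspect the returned object before relying on any particular field.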
Higher-rated alternatives
open-edge-platform/datumaro
Dataset Management Framework, a Python library and a CLI tool to build, analyze and manage...
explosion/ml-datasets
🌊 Machine learning dataset loaders for testing and example scripts
webdataset/webdataset
A high-performance Python-based I/O system for large (and small) deep learning problems, with...
tensorflow/datasets
TFDS is a collection of datasets ready to use with TensorFlow, Jax, ...
mlcommons/croissant
Croissant is a high-level format for machine learning datasets that brings together four rich layers.