[Docs update] Specify off-topic: what is NOT risk data #237

Open
matamadio opened this issue Sep 1, 2023 · 6 comments
Labels
Docs This issue relates to documentation

Comments

@matamadio
Contributor

What is the context or reason for the change?

After discussion with fellows, we recognize the need to specify which kinds of data are not directly risk-related and are therefore not meant for the RDLS. In addition, the schema might be unable to represent them properly.

The following is a list of common items that might be present in risk analytics as ancillary data (e.g. for visualisation), but are not strictly risk data:

  • DTM/DEM
  • River network, geology, other natural features
  • Political boundaries, catchment boundaries (only make sense if they include risk attributes)
  • Orthophotos and raw satellite images (i.e. not yet elaborated into a hazard or exposure model)
  • raw census data (i.e. not elaborated into a vulnerability model)
  • general climate data (avoid overlap with CCKP and other climate portals)

Please add any I might have forgotten.

What is your proposed change?

Add a section in the documentation or on the website to explain which data are not meant for the RDL.

@matamadio added the Docs label on Sep 1, 2023
@duncandewhurst
Contributor

Sounds good to me. I suggest adding this content to a new 'What's not in scope?' page at the end of the introduction.

@pzwsk
Contributor

pzwsk commented Sep 3, 2023

Any Base data from the OpenDRI Index:

  • DEM
  • Aerial imagery
  • Water bodies
  • Soil map
  • Admin boundaries
  • Watershed boundaries

image

@pzwsk
Contributor

pzwsk commented Sep 3, 2023

I believe there is a grey zone for ALL exposure data, no? Maybe there is also value in describing the raw data that was used when there is no pre-existing, exposure-ready dataset?

@duncandewhurst
Contributor

From @stufraser1 on today's check-in call - these are not risk datasets, but they can be listed in the sources section of RDLS metadata for datasets that use these as sources. The suggested page can include an explanation of the use of sources and an example in JSON/tabular format showing the risk dataset title and description and the sources field.

@stufraser1, @matamadio and @pzwsk to agree on content.
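
As a rough illustration of the JSON/tabular example suggested above, the following Python sketch builds a hypothetical metadata record for a flood hazard dataset that cites a DEM and a river network as sources. Only the title, description and sources fields come from the discussion; the dataset itself, the `risk_data_type` key, and the keys inside each source entry (`id`, `name`, `type`) are assumptions made for illustration and should be checked against the RDLS schema reference.

```python
import json

# Hypothetical record: the dataset and its sources are invented for
# illustration. Keys other than title, description and sources are assumptions.
flood_hazard_dataset = {
    "title": "Example river flood hazard maps",
    "description": "Flood depth maps modelled using a DEM and a river network as inputs.",
    "risk_data_type": ["hazard"],  # assumed key name
    "sources": [
        {"id": "1", "name": "Example digital elevation model", "type": "dataset"},
        {"id": "2", "name": "Example river network", "type": "dataset"},
    ],
}


def sources_table(dataset):
    """Flatten the sources list into one row per source (tabular view)."""
    return [
        {
            "dataset_title": dataset["title"],
            "source_id": source["id"],
            "source_name": source["name"],
        }
        for source in dataset["sources"]
    ]


# JSON view of the whole record
print(json.dumps(flood_hazard_dataset, indent=2))

# Tabular view: one row per cited source
for row in sources_table(flood_hazard_dataset):
    print(row)
```

The DEM and the river network are not described as datasets in their own right here; they appear only as entries in the sources list of the hazard dataset derived from them.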

@matamadio
Contributor Author

matamadio commented Sep 5, 2023

Draft text to be included in the page, for revision (@stufraser1):

Risk analytics often require ancillary data to develop hazard, exposure and vulnerability data and to provide context for mapping and visualization. These may be geospatial or non-geospatial data.

The following data are commonly part of the data package produced within a risk assessment project, but are not themselves considered risk data. As such, they are not meant to be described using the RDLS; however, by using the source object, they can be included as a source for the risk data produced from them.
These elements include:

Basemap data

  • DTM and DEM are not to be described using RDLS unless they are filtered and processed into hazard zones. Risk datasets using a DTM as input can cite the original DTM as source dataset.
  • Water bodies, soil, geology, other natural features; generally these are inputs to a hazard dataset, and can be included as a 'source' but shouldn't be described using RDLS.
  • Administrative boundaries, watersheds or other kinds of boundaries, unless the boundaries carry risk attributes, e.g., risk scores by admin unit fit into the loss schema.
  • Orthophotos, aerial and satellite imagery; unless elaborated into a proper hazard or exposure model, e.g., building footprints obtained from image interpretation can fit into the exposure schema.

General socio-economic indicators

  • Raw census data, unless attributes are elaborated into a vulnerability model or index (e.g., by principal component analysis of census variables), in which case it can fit into the vulnerability schema. Census and socio-economic information commonly appears in exposure datasets alongside other information describing the population. The final exposure dataset would be described with RDLS, with source data referenced in the 'source' object.
  • General climate data; the RDL avoids overlap with dedicated climate portals such as World Bank CCKP.
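
The "unless elaborated" rule running through the draft list can be sketched as a small lookup. This is only an illustration: the category labels, the function name `rdls_role`, and the returned strings are hypothetical and are not part of the RDLS. General climate data is left out because the draft gives no "unless" clause for it.

```python
# Risk components named in the draft text.
RISK_COMPONENTS = {"hazard", "exposure", "vulnerability", "loss"}

# Ancillary categories from the draft list (labels are hypothetical).
ANCILLARY_CATEGORIES = {
    "dtm", "dem", "water_bodies", "soil", "geology",
    "admin_boundaries", "watersheds", "imagery", "raw_census",
}


def rdls_role(category, elaborated_into=None):
    """Return how a dataset would be handled under the draft rules.

    category: one of the hypothetical ancillary labels above.
    elaborated_into: the risk component the data has been processed into,
    or None if the data is still raw.
    """
    if elaborated_into in RISK_COMPONENTS:
        return "describe with RDLS as " + elaborated_into
    if category in ANCILLARY_CATEGORIES:
        return "cite as source only"
    return "undecided"


# Building footprints derived from imagery fit into the exposure schema.
print(rdls_role("imagery", elaborated_into="exposure"))
# A raw DEM is only cited as a source of the datasets derived from it.
print(rdls_role("dem"))
```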

@matamadio
Contributor Author

matamadio commented Sep 7, 2023

There's already some non-risk data in the RDL collection, such as https://datacatalog.worldbank.org/int/search/dataset/0064179/Central-Asia-river-network--MERIT-Hydro-data-

Projects
Status: Under discussion
Development

No branches or pull requests

4 participants