Clarify no-data value / null for most processes #480

Open-EO · Jan 3, 2024 · commit a3710ad
1 parent d5d0a18

Showing 94 changed files with 186 additions and 302 deletions.
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -8,21 +8,23 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ### Changed
 
-- `clip`: Throw an exception if min > max [#472](https://github.com/Open-EO/openeo-processes/issues/472)
+- Clarified for various mathematical functions the defined input and output ranges. Mention that `NaN` is returned outside of the defined input range where possible.
+- Clarified for various processes the handling of no-data values and null, see also the [implementation guide](meta/implementation.md).  [#480](https://github.com/Open-EO/openeo-processes/issues/480)
 - Added a uniqueness constraint to various array-typed parameters (e.g. lists of dimension names or labels)
+- `array_interpolate_linear`: Apply interpolation to NaN and no-data values.
+- `clip`: Throw an exception if min > max. [#472](https://github.com/Open-EO/openeo-processes/issues/472)
 
 ### Fixed
 
-- Clarified for various mathematical functions the defined input and output ranges. Mention that `NaN` is returned outside of the defined input range where possible.
 - `aggregate_temporal` and `aggregate_temporal_period`: Clarified that the process throws a `DimensionNotAvailable` exception when no temporal dimension exists.
-- `aggregate_temporal_period`: Removed unused exception `DistinctDimensionLabelsRequired`
-- `aggregate_temporal_period`: Clarified that the definition of weeks follows ISO 8601
-- `divide`: Clarified behavior for division by 0
+- `aggregate_temporal_period`: Removed unused exception `DistinctDimensionLabelsRequired`.
+- `aggregate_temporal_period`: Clarified that the definition of weeks follows ISO 8601.
+- `divide`: Clarified behavior for division by 0.
 - `between`: Clarify that `null` is passed through.
 - `eq` and `neq`: Explicitly set the minimum value for the `delta` parameter.
 - `filter_bbox`, `load_collection`, `load_stac`: Clarified that the bounding box is reprojected to the CRS of the spatial data cube dimensions if required.
 - `filter_spatial`: Clarified that masking is applied using the given geometries. [#469](https://github.com/Open-EO/openeo-processes/issues/469)
-- `mod`: Clarified behavior for y = 0
+- `mod`: Clarified behavior for y = 0.
 - `sqrt`: Clarified that NaN is returned for negative numbers.
 
 ## [2.0.0-rc.1] - 2023-05-25

diff --git a/absolute.json b/absolute.json
@@ -1,7 +1,7 @@
 {
     "id": "absolute",
     "summary": "Absolute value",
-    "description": "Computes the absolute value of a real number `x`, which is the \"unsigned\" portion of `x` and often denoted as *|x|*.\n\nThe no-data value `null` is passed through and therefore gets propagated.",
+    "description": "Computes the absolute value of a real number `x`, which is the \"unsigned\" portion of `x` and often denoted as *|x|*.\n\nNo-data values are passed through and therefore get propagated.",
     "categories": [
         "math"
     ],

diff --git a/add.json b/add.json
@@ -1,7 +1,7 @@
 {
     "id": "add",
     "summary": "Addition of two numbers",
-    "description": "Sums up the two numbers `x` and `y` (*`x + y`*) and returns the computed sum.\n\nNo-data values are taken into account so that `null` is returned if any element is such a value.\n\nThe computations follow [IEEE Standard 754](https://ieeexplore.ieee.org/document/8766229) whenever the processing environment supports it.",
+    "description": "Sums up the two numbers `x` and `y` (*`x + y`*) and returns the computed sum.\n\nNo-data values are taken into account so that the no-data value is returned if any element is such a value.\n\nThe computations follow [IEEE Standard 754](https://ieeexplore.ieee.org/document/8766229) whenever the processing environment supports it.",
     "categories": [
         "math"
     ],
@@ -88,4 +88,4 @@
             "result": true
         }
     }
-}
+}
diff --git a/aggregate_temporal.json b/aggregate_temporal.json
@@ -89,7 +89,7 @@
         },
         {
             "name": "reducer",
-            "description": "A reducer to be applied for the values contained in each interval. A reducer is a single process such as ``mean()`` or a set of processes, which computes a single value for a list of values, see the category 'reducer' for such processes. Intervals may not contain any values, which for most reducers leads to no-data (`null`) values by default.",
+            "description": "A reducer to be applied for the values contained in each interval. A reducer is a single process such as ``mean()`` or a set of processes, which computes a single value for a list of values, see the category 'reducer' for such processes. Intervals may not contain any values, which leads to a no-data value for most reducers by default.",
             "schema": {
                 "type": "object",
                 "subtype": "process-graph",

diff --git a/aggregate_temporal_period.json b/aggregate_temporal_period.json
@@ -42,7 +42,7 @@
         },
         {
             "name": "reducer",
-            "description": "A reducer to be applied for the values contained in each period. A reducer is a single process such as ``mean()`` or a set of processes, which computes a single value for a list of values, see the category 'reducer' for such processes. Periods may not contain any values, which for most reducers leads to no-data (`null`) values by default.",
+            "description": "A reducer to be applied for the values contained in each period. A reducer is a single process such as ``mean()`` or a set of processes, which computes a single value for a list of values, see the category 'reducer' for such processes. Periods may not contain any values, which leads to a no-data value for most reducers by default.",
             "schema": {
                 "type": "object",
                 "subtype": "process-graph",

diff --git a/all.json b/all.json
@@ -1,7 +1,7 @@
 {
     "id": "all",
     "summary": "Are all of the values true?",
-    "description": "Checks if **all** of the values in `data` are true. If no value is given (i.e. the array is empty) the process returns `null`.\n\nBy default all no-data values are ignored so that the process returns `null` if all values are no-data, `true` if all values are true and `false` otherwise. Setting the `ignore_nodata` flag to `false` takes no-data values into account and the array values are reduced pairwise according to the following truth table:\n\n```\n      || null  | false | true\n----- || ----- | ----- | -----\nnull  || null  | false | null\nfalse || false | false | false\ntrue  || null  | false | true\n```\n\n**Remark:** The process evaluates all values from the first to the last element and stops once the outcome is unambiguous. A result is ambiguous unless a value is `false` or all values have been taken into account.",
+    "description": "Checks if **all** of the values in `data` are true. If no value is given (i.e. the array is empty) the process returns `null`.\n\nBy default all no-data values are ignored so that the process returns the no-data value (or `null`) if all values are no-data, `true` if all values are true and `false` otherwise. Setting the `ignore_nodata` flag to `false` takes no-data values into account and the array values are reduced pairwise according to the following truth table:\n\n```\n        || no-data | false | true\n------- || ------- | ----- | -------\nno-data || no-data | false | no-data\nfalse   || false   | false | false\ntrue    || no-data | false | true\n```\n\n**Remark:** The process evaluates all values from the first to the last element and stops once the outcome is unambiguous. A result is ambiguous unless a value is `false` or all values have been taken into account.",
     "categories": [
         "logic",
         "reducer"
@@ -22,7 +22,7 @@
         },
         {
             "name": "ignore_nodata",
-            "description": "Indicates whether no-data values are ignored or not and ignores them by default.",
+            "description": "Indicates whether no-data values are ignored or not. Ignores them by default.",
             "schema": {
                 "type": "boolean"
             },
@@ -131,4 +131,4 @@
             "returns": null
         }
     ]
-}
+}
diff --git a/and.json b/and.json
@@ -1,7 +1,7 @@
 {
     "id": "and",
     "summary": "Logical AND",
-    "description": "Checks if **both** values are true.\n\nEvaluates parameter `x` before `y` and stops once the outcome is unambiguous. If any argument is `null`, the result will be `null` if the outcome is ambiguous.\n\n**Truth table:**\n\n```\nx \\ y || null  | false | true\n----- || ----- | ----- | -----\nnull  || null  | false | null\nfalse || false | false | false\ntrue  || null  | false | true\n```",
+    "description": "Checks if **both** values are true.\n\nEvaluates parameter `x` before `y` and stops once the outcome is unambiguous. If any argument is a no-data value, the result will be the no-data value whenever the outcome is ambiguous.\n\n**Truth table:**\n\n```\nx \\ y   || no-data | false | true\n------- || ------- | ----- | -------\nno-data || no-data | false | no-data\nfalse   || false   | false | false\ntrue    || no-data | false | true\n```",
     "categories": [
         "logic"
     ],
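
The truth table in the new `and` description can be read as three-valued logic. The following is a minimal sketch, not the reference implementation: it assumes the no-data value is modelled as Python `None`, and the name `and_` is hypothetical.

```python
from typing import Optional

NO_DATA = None  # assumption: no-data is represented by None in this sketch


def and_(x: Optional[bool], y: Optional[bool]) -> Optional[bool]:
    """Three-valued logical AND following the truth table in and.json."""
    # A single False makes the outcome unambiguous, whatever the other value is.
    if x is False or y is False:
        return False
    # No False is present, so any no-data value leaves the outcome ambiguous.
    if x is NO_DATA or y is NO_DATA:
        return NO_DATA
    return True
```

Checking for `False` first is what makes `and_(NO_DATA, False)` return `False` rather than no-data, matching the `no-data`/`false` cell of the table.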

diff --git a/any.json b/any.json
@@ -1,7 +1,7 @@
 {
     "id": "any",
     "summary": "Is at least one value true?",
-    "description": "Checks if **any** (i.e. at least one) value in `data` is `true`. If no value is given (i.e. the array is empty) the process returns `null`.\n\nBy default all no-data values are ignored so that the process returns `null` if all values are no-data, `true` if at least one value is true and `false` otherwise. Setting the `ignore_nodata` flag to `false` takes no-data values into account and the array values are reduced pairwise according to the following truth table:\n\n```\n      || null | false | true\n----- || ---- | ----- | ----\nnull  || null | null  | true\nfalse || null | false | true\ntrue  || true | true  | true\n```\n\n**Remark:** The process evaluates all values from the first to the last element and stops once the outcome is unambiguous. A result is ambiguous unless a value is `true`.",
+    "description": "Checks if **any** (i.e. at least one) value in `data` is `true`. If no value is given (i.e. the array is empty) the process returns `null`.\n\nBy default all no-data values are ignored so that the process returns the no-data value (or `null`) if all values are no-data, `true` if at least one value is true and `false` otherwise. Setting the `ignore_nodata` flag to `false` takes no-data values into account and the array values are reduced pairwise according to the following truth table:\n\n```\n        || no-data | false   | true\n------- || ------- | ------- | ----\nno-data || no-data | no-data | true\nfalse   || no-data | false   | true\ntrue    || true    | true    | true\n```\n\n**Remark:** The process evaluates all values from the first to the last element and stops once the outcome is unambiguous. A result is ambiguous unless a value is `true`.",
     "categories": [
         "logic",
         "reducer"
@@ -22,7 +22,7 @@
         },
         {
             "name": "ignore_nodata",
-            "description": "Indicates whether no-data values are ignored or not and ignores them by default.",
+            "description": "Indicates whether no-data values are ignored or not. Ignores them by default.",
             "schema": {
                 "type": "boolean"
             },
@@ -131,4 +131,4 @@
             "returns": null
         }
     ]
-}
+}
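
The `any` description above combines two behaviours: dropping no-data values by default, and reducing them via the truth table when `ignore_nodata` is `false`. A minimal sketch of that logic follows, assuming no-data is modelled as Python `None`; the function name `any_` is hypothetical and this is not the reference implementation.

```python
from typing import Iterable, Optional


def any_(data: Iterable[Optional[bool]], ignore_nodata: bool = True) -> Optional[bool]:
    """Reduce boolean values as described for the `any` process."""
    values = list(data)
    if ignore_nodata:
        # Default behaviour: no-data values are dropped before reducing.
        values = [v for v in values if v is not None]
    if not values:
        # Empty input, or every value was no-data and got dropped.
        return None
    if any(v is True for v in values):
        # A single True makes the outcome unambiguous.
        return True
    if any(v is None for v in values):
        # No True found, but a no-data value could have been True.
        return None
    return False
```

For example, `any_([False, None])` yields `False` because the no-data value is ignored, while `any_([False, None], ignore_nodata=False)` yields `None`.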
diff --git a/apply_neighborhood.json b/apply_neighborhood.json
@@ -1,7 +1,7 @@
 {
     "id": "apply_neighborhood",
     "summary": "Apply a process to pixels in a n-dimensional neighborhood",
-    "description": "Applies a focal process to a data cube.\n\nA focal process is a process that works on a 'neighborhood' of pixels. The neighborhood can extend into multiple dimensions, this extent is specified by the `size` argument. It is not only (part of) the size of the input window, but also the size of the output for a given position of the sliding window. The sliding window moves with multiples of `size`.\n\nAn overlap can be specified so that neighborhoods can have overlapping boundaries. This allows for continuity of the output. The overlap region must be included in the data cube or array returned by `process`, but any changed values will be ignored. The missing overlap at the borders of the original data cube is made available as no-data (`null`) in the sub-data cubes.\n\nThe neighborhood size should be kept small enough, to avoid running beyond computational resources, but a too-small size will result in a larger number of process invocations, which may slow down processing. Window sizes for spatial dimensions typically range from 64 to 512 pixels, while overlaps of 8 to 32 pixels are common.\n\nFor the special case of 2D convolution, it is recommended to use ``apply_kernel()``.",
+    "description": "Applies a focal process to a data cube.\n\nA focal process is a process that works on a 'neighborhood' of pixels. The neighborhood can extend into multiple dimensions, this extent is specified by the `size` argument. It is not only (part of) the size of the input window, but also the size of the output for a given position of the sliding window. The sliding window moves with multiples of `size`.\n\nAn overlap can be specified so that neighborhoods can have overlapping boundaries. This allows for continuity of the output. The overlap region must be included in the data cube or array returned by `process`, but any changed values will be ignored. The missing overlap at the borders of the original data cube is made available as no-data values in the sub-data cubes.\n\nThe neighborhood size should be kept small enough, to avoid running beyond computational resources, but a too-small size will result in a larger number of process invocations, which may slow down processing. Window sizes for spatial dimensions typically range from 64 to 512 pixels, while overlaps of 8 to 32 pixels are common.\n\nFor the special case of 2D convolution, it is recommended to use ``apply_kernel()``.",
     "categories": [
         "cubes"
     ],

diff --git a/arccos.json b/arccos.json
@@ -1,7 +1,7 @@
 {
     "id": "arccos",
     "summary": "Inverse cosine",
-    "description": "Computes the arc cosine of `x`. The arc cosine is the inverse function of the cosine so that *`arccos(cos(x)) = x`*.\n\nWorks on radians only.\nThe no-data value `null` is passed through and therefore gets propagated. `NaN` is returned for values outside of the allowed range.",
+    "description": "Computes the arc cosine of `x`. The arc cosine is the inverse function of the cosine so that *`arccos(cos(x)) = x`*.\n\nWorks on radians only.\nNo-data values are passed through and therefore get propagated. `NaN` is returned for values outside of the allowed range.",
     "categories": [
         "math > trigonometric"
     ],
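
The two rules stated in the `arccos` description (no-data values pass through, `NaN` outside the allowed range) can be sketched as below. This assumes no-data is modelled as Python `None` and that the allowed range is [-1, 1]; the function name `arccos` mirrors the process id but is only an illustration.

```python
import math
from typing import Optional


def arccos(x: Optional[float]) -> Optional[float]:
    """Arc cosine in radians with no-data pass-through and NaN for out-of-range input."""
    if x is None:
        # No-data values are passed through and therefore propagate.
        return None
    if math.isnan(x) or x < -1 or x > 1:
        # math.acos would raise ValueError here; the description asks for NaN instead.
        return math.nan
    return math.acos(x)
```

Note that no-data and `NaN` stay distinct in this sketch: `arccos(None)` returns `None`, whereas `arccos(2)` returns `NaN`.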

diff --git a/arcosh.json b/arcosh.json
@@ -1,7 +1,7 @@
 {
     "id": "arcosh",
     "summary": "Inverse hyperbolic cosine",
-    "description": "Computes the inverse hyperbolic cosine of `x`. It is the inverse function of the hyperbolic cosine so that *`arcosh(cosh(x)) = x`*.\n\nThe no-data value `null` is passed through and therefore gets propagated. `NaN` is returned for values outside of the allowed range.",
+    "description": "Computes the inverse hyperbolic cosine of `x`. It is the inverse function of the hyperbolic cosine so that *`arcosh(cosh(x)) = x`*.\n\nNo-data values are passed through and therefore get propagated. `NaN` is returned for values outside of the allowed range.",
     "categories": [
         "math > trigonometric"
     ],

diff --git a/arcsin.json b/arcsin.json
@@ -1,7 +1,7 @@
 {
     "id": "arcsin",
     "summary": "Inverse sine",
-    "description": "Computes the arc sine of `x`. The arc sine is the inverse function of the sine so that *`arcsin(sin(x)) = x`*.\n\nWorks on radians only.\nThe no-data value `null` is passed through and therefore gets propagated. `NaN` is returned for values < -1 and > 1.",
+    "description": "Computes the arc sine of `x`. The arc sine is the inverse function of the sine so that *`arcsin(sin(x)) = x`*.\n\nWorks on radians only.\nNo-data values are passed through and therefore get propagated. `NaN` is returned for values < -1 and > 1.",
     "categories": [
         "math > trigonometric"
     ],

diff --git a/arctan.json b/arctan.json
@@ -1,7 +1,7 @@
 {
     "id": "arctan",
     "summary": "Inverse tangent",
-    "description": "Computes the arc tangent of `x`. The arc tangent is the inverse function of the tangent so that *`arctan(tan(x)) = x`*.\n\nWorks on radians only.\nThe no-data value `null` is passed through and therefore gets propagated.",
+    "description": "Computes the arc tangent of `x`. The arc tangent is the inverse function of the tangent so that *`arctan(tan(x)) = x`*.\n\nWorks on radians only.\nNo-data values are passed through and therefore get propagated.",
     "categories": [
         "math > trigonometric"
     ],

diff --git a/arctan2.json b/arctan2.json
@@ -1,7 +1,7 @@
 {
     "id": "arctan2",
     "summary": "Inverse tangent of two numbers",
-    "description": "Computes the arc tangent of two numbers `x` and `y`. It is similar to calculating the arc tangent of *`y / x`*, except that the signs of both arguments are used to determine the quadrant of the result.\n\nWorks on radians only.\nThe no-data value `null` is passed through and therefore gets propagated if any of the arguments is `null`.",
+    "description": "Computes the arc tangent of two numbers `x` and `y`. It is similar to calculating the arc tangent of *`y / x`*, except that the signs of both arguments are used to determine the quadrant of the result.\n\nWorks on radians only.\nIf any argument is a no-data value, the result will be the no-data value (or `null`).",
     "categories": [
         "math > trigonometric"
     ],
@@ -59,4 +59,4 @@
             "title": "Two-argument inverse tangent explained by Wikipedia"
         }
     ]
-}
+}
diff --git a/array_contains.json b/array_contains.json
@@ -1,7 +1,7 @@
 {
     "id": "array_contains",
     "summary": "Check whether the array contains a given value",
-    "description": "Checks whether the array specified for `data` contains the value specified in `value`. Returns `true` if there's a match, otherwise `false`.\n\n**Remarks:**\n\n* To get the index or the label of the value found, use ``array_find()``.\n* All definitions for the process ``eq()`` regarding the comparison of values apply here as well. A `null` return value from ``eq()`` is handled exactly as `false` (no match).\n* Data types MUST be checked strictly. For example, a string with the content *1* is not equal to the number *1*.\n* An integer *1* is equal to a floating-point number *1.0* as `integer` is a sub-type of `number`. Still, this process may return unexpectedly `false` when comparing floating-point numbers due to floating-point inaccuracy in machine-based computation.\n* Temporal strings are treated as normal strings and MUST NOT be interpreted.",
+    "description": "Checks whether the array specified for `data` contains the value specified in `value`. Returns `true` if there's a match, otherwise `false`.\n\n**Remarks:**\n\n* To get the index or the label of the value found, use ``array_find()``.\n* All definitions for the process ``eq()`` regarding the comparison of values apply here as well. A no-data return value from ``eq()`` is handled as `false` (no match).\n* Data types MUST be checked strictly. For example, a string with the content *1* is not equal to the number *1*.\n* An integer *1* is equal to a floating-point number *1.0* as `integer` is a sub-type of `number`. Still, this process may return unexpectedly `false` when comparing floating-point numbers due to floating-point inaccuracy in machine-based computation.\n* Temporal strings are treated as normal strings and MUST NOT be interpreted.\n\nSee the examples to check for no-data values.",
     "categories": [
         "arrays",
         "comparison",
@@ -20,7 +20,7 @@
         },
         {
             "name": "value",
-            "description": "Value to find in `data`. If the value is `null`, this process returns always `false`.",
+            "description": "Value to find in `data`. If the value is a no-data value (or `null`), this process always returns `false`.",
             "schema": {
                 "type": [
                     "number",