From ee7de641a2d645987388ad9b3d2f571f3809691b Mon Sep 17 00:00:00 2001
From: Francesco Nattino
Date: Thu, 3 Mar 2022 22:28:32 +0100
Subject: [PATCH] Ryan's comments - fix challenge

---
 _episodes/20-parallel-raster-computations.md | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/_episodes/20-parallel-raster-computations.md b/_episodes/20-parallel-raster-computations.md
index 1c589b50..aba46cbb 100644
--- a/_episodes/20-parallel-raster-computations.md
+++ b/_episodes/20-parallel-raster-computations.md
@@ -202,7 +202,7 @@ Xarray and Dask also provide a graphical representation of the raster data array
 > 
 > In order to optimally access COGs it is best to align the blocksize of the file with the chunks employed when loading
 > the file. Open the blue-band asset ("B02") of a Sentinel-2 scene as a chunked `DataArray` object using a suitable
-> chunk size. Which elements do you think should be considered when choosing such value?
+> chunk size. Which elements do you think should be considered when choosing the chunk size?
 > 
 > > ## Solution
 > > ~~~
@@ -218,18 +218,20 @@ Xarray and Dask also provide a graphical representation of the raster data array
 > > ~~~
 > > {: .output}
 > > 
-> > Ideal values are thus multiples of 1024. An element to consider is the number of resulting chunks and their size.
-> > Chunks should not be too big nor too small (i.e. too many). Recommended chunk sizes are of the order of 100 MB.
-> > Also, the shape might be relevant, depending on the application! Here, we might select a chunks shape of
-> > `(1, 6144, 6144)`:
+> > Ideal chunk size values for this raster are thus multiples of 1024. An element to consider is the number of
+> > resulting chunks and their size. Chunks should not be too big nor too small (i.e. too many). As a rule of thumb,
+> > chunk sizes of 100 MB typically work well with Dask (see, e.g., this
+> > [blog post](https://blog.dask.org/2021/11/02/choosing-dask-chunk-sizes)). Also, the shape might be relevant,
+> > depending on the application! Here, we might select a chunks shape of `(1, 6144, 6144)`:
 > > 
 > > ~~~
 > > band = rioxarray.open_rasterio(band_url, chunks=(1, 6144, 6144))
 > > ~~~
 > > {: .language-python}
 > > 
-> > which leads to chunks 72 MB large. Also, we can let `rioxarray` and Dask figure out appropriate chunk shapes by
-> > setting `chunks="auto"`:
+> > which leads to chunks 72 MB large: (1 x 6144 x 6144) elements, 2 bytes per element (the data type is unsigned
+> > integer `uint16`), i.e., 6144 x 6144 x 2 / 2^20 = 72 MB. Also, we can let `rioxarray` and Dask figure out
+> > appropriate chunk shapes by setting `chunks="auto"`:
 > > 
 > > ~~~
 > > band = rioxarray.open_rasterio(band_url, chunks="auto")
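
For reviewers of this patch, the chunk-size reasoning in the revised solution text can be reproduced with a short, self-contained sketch. It is illustrative only and not part of the lesson episode: `band_url` is a placeholder for the Sentinel-2 "B02" asset URL used in the episode, and the block shape shown in the comments is an assumption based on the 1024-multiple blocksizes the solution refers to.

~~~
import numpy as np
import rasterio
import rioxarray

# Placeholder URL: in the episode, band_url points to the "B02" asset of a
# Sentinel-2 scene stored as a Cloud Optimized GeoTIFF.
band_url = "https://example.com/sentinel-2-l2a/B02.tif"

# Inspect the file's internal tiling: chunk sizes should be multiples of these
# block shapes so that each block is read only once.
with rasterio.open(band_url) as src:
    print(src.block_shapes)  # e.g. [(1024, 1024)] (assumed for this sketch)
    print(src.dtypes)        # e.g. ('uint16',)

# Memory footprint of one (1, 6144, 6144) chunk of uint16 values:
# 6144 * 6144 * 2 bytes / 2**20 = 72 MB, in line with the ~100 MB rule of thumb.
chunks = (1, 6144, 6144)
chunk_mb = np.prod(chunks) * np.dtype("uint16").itemsize / 2**20
print(f"chunk size: {chunk_mb:.0f} MB")

# Open the band lazily with the chosen chunks, or let rioxarray/Dask decide.
band = rioxarray.open_rasterio(band_url, chunks=chunks)
band_auto = rioxarray.open_rasterio(band_url, chunks="auto")
~~~
{: .language-python}

Picking 6144 (i.e., 6 x 1024) per spatial dimension keeps each chunk edge aligned with the COG's internal blocks while staying below the ~100 MB chunk size suggested in the linked blog post.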