Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dataSquare.extendSquare can take advantage of loop reuse when building fillerExtendedRow and fillerRow #305

Open
odeke-em opened this issue Mar 5, 2024 · 1 comment

Comments

@odeke-em
Copy link

odeke-em commented Mar 5, 2024

While studying this code I noticed something that fillerRow is as long as ds.width+extendedWidth while fillerExtendedRow is as long as extendedWidth we could take advantage of loop reuse to build both fillerExtendedRow and fillerRow we can simply build them in the same loop and add a guard against the shorter length of extendedWidth

diff --git a/datasquare.go b/datasquare.go
index 0c16f59..5af113b 100644
--- a/datasquare.go
+++ b/datasquare.go
@@ -80,12 +80,12 @@ func (ds *dataSquare) extendSquare(extendedWidth uint, fillerChunk []byte) error
 	newSquareRow := make([][][]byte, newWidth)
 
 	fillerExtendedRow := make([][]byte, extendedWidth)
-	for i := uint(0); i < extendedWidth; i++ {
-		fillerExtendedRow[i] = fillerChunk
-	}
-
 	fillerRow := make([][]byte, newWidth)
 	for i := uint(0); i < newWidth; i++ {
+		if i < extendedWidth {
+			fillerExtendedRow[i] = fillerChunk
+		}
 		fillerRow[i] = fillerChunk
 	}

which produces interesting results

name                                          old time/op    new time/op    delta
Encoding/Leopard_128_shares_512-8               39.8µs ± 5%    39.0µs ± 3%     ~     (p=0.165 n=10+10)
Decoding/Leopard_128_shares_512-8                422ns ± 3%     424ns ± 4%     ~     (p=0.529 n=10+10)
EDSRoots/32x32x512_ODS-8                        5.92ms ± 4%    5.92ms ± 4%     ~     (p=0.853 n=10+10)
EDSRoots/64x64x512_ODS-8                        22.3ms ± 3%    22.2ms ± 2%     ~     (p=0.661 n=10+9)
EDSRoots/128x128x512_ODS-8                       106ms ±29%      87ms ± 4%  -17.51%  (p=0.022 n=9+10)
EDSRoots/256x256x512_ODS-8                       380ms ± 1%     350ms ± 3%   -7.90%  (p=0.000 n=8+9)
EDSRoots/512x512x512_ODS-8                       1.55s ±11%     1.35s ± 2%  -12.69%  (p=0.000 n=9+9)
Repair/Leopard_4x4x512_ODS-8                     370µs ± 6%     344µs ± 2%   -7.09%  (p=0.000 n=10+9)
Repair/Leopard_8x8x512_ODS-8                    1.37ms ± 3%    1.39ms ± 7%     ~     (p=0.315 n=9+10)
Repair/Leopard_16x16x512_ODS-8                  5.56ms ± 2%    5.43ms ± 1%   -2.23%  (p=0.000 n=10+10)
Repair/Leopard_32x32x512_ODS-8                  23.4ms ± 4%    22.3ms ± 4%   -4.90%  (p=0.001 n=9+9)
Repair/Leopard_64x64x512_ODS-8                  93.8ms ± 2%    89.5ms ± 2%   -4.60%  (p=0.000 n=9+10)
Repair/Leopard_128x128x512_ODS-8                 387ms ±16%     372ms ± 7%     ~     (p=0.912 n=10+10)
Repair/Leopard_256x256x512_ODS-8                 2.52s ± 5%     2.47s ± 2%     ~     (p=0.481 n=10+10)
Repair/Leopard_512x512x512_ODS-8                 8.93s ± 5%     8.66s ± 4%   -3.03%  (p=0.029 n=10+10)
ExtensionEncoding/Leopard_4x4x512_ODS-8         24.0µs ±10%    24.4µs ± 7%     ~     (p=0.190 n=10+10)
ExtensionEncoding/Leopard_8x8x512_ODS-8         61.9µs ±14%    59.2µs ± 4%     ~     (p=0.075 n=10+10)
ExtensionEncoding/Leopard_16x16x512_ODS-8        154µs ± 8%     148µs ± 3%   -4.23%  (p=0.010 n=10+9)
ExtensionEncoding/Leopard_32x32x512_ODS-8        474µs ± 5%     475µs ± 3%     ~     (p=0.400 n=9+10)
ExtensionEncoding/Leopard_64x64x512_ODS-8       1.74ms ± 9%    1.64ms ± 6%   -5.63%  (p=0.008 n=9+9)
ExtensionEncoding/Leopard_128x128x512_ODS-8     7.96ms ± 5%    7.04ms ± 1%  -11.56%  (p=0.000 n=7+10)
ExtensionEncoding/Leopard_256x256x512_ODS-8     40.6ms ±13%    36.9ms ±11%   -9.21%  (p=0.043 n=10+9)
ExtensionEncoding/Leopard_512x512x512_ODS-8      164ms ± 4%     164ms ± 5%     ~     (p=1.000 n=8+8)
ExtensionWithRoots/Leopard_4x4x512_ODS-8         171µs ± 6%     170µs ± 2%     ~     (p=0.604 n=10+9)
ExtensionWithRoots/Leopard_8x8x512_ODS-8         531µs ±10%     532µs ± 8%     ~     (p=0.739 n=10+10)
ExtensionWithRoots/Leopard_16x16x512_ODS-8      1.72ms ± 3%    1.79ms ±11%     ~     (p=0.133 n=10+9)
ExtensionWithRoots/Leopard_32x32x512_ODS-8      6.10ms ± 4%    7.32ms ±23%  +20.00%  (p=0.000 n=10+9)
ExtensionWithRoots/Leopard_64x64x512_ODS-8      23.6ms ± 2%    24.0ms ± 3%     ~     (p=0.136 n=9+9)
ExtensionWithRoots/Leopard_128x128x512_ODS-8    97.2ms ± 9%    96.4ms ± 8%     ~     (p=0.529 n=10+10)
ExtensionWithRoots/Leopard_256x256x512_ODS-8     459ms ±11%     399ms ± 5%  -13.06%  (p=0.000 n=10+9)
ExtensionWithRoots/Leopard_512x512x512_ODS-8     1.79s ± 4%     1.58s ± 4%  -12.04%  (p=0.000 n=9+8)

name                                          old alloc/op   new alloc/op   delta
Encoding/Leopard_128_shares_512-8                120kB ± 2%     118kB ± 2%   -1.58%  (p=0.043 n=10+10)
Decoding/Leopard_128_shares_512-8                0.00B          0.00B          ~     (all equal)
EDSRoots/32x32x512_ODS-8                        1.81MB ± 0%    1.81MB ± 0%     ~     (p=0.753 n=10+10)
EDSRoots/64x64x512_ODS-8                        6.24MB ± 0%    6.24MB ± 0%     ~     (p=1.000 n=9+10)
EDSRoots/128x128x512_ODS-8                      26.5MB ± 0%    26.5MB ± 0%     ~     (p=0.833 n=10+10)
EDSRoots/256x256x512_ODS-8                       109MB ± 0%     109MB ± 0%     ~     (p=0.435 n=9+9)
EDSRoots/512x512x512_ODS-8                       497MB ± 0%     497MB ± 0%     ~     (p=0.467 n=8+8)
Repair/Leopard_4x4x512_ODS-8                     163kB ± 1%     164kB ± 1%     ~     (p=0.190 n=10+10)
Repair/Leopard_8x8x512_ODS-8                     521kB ± 4%     526kB ± 4%     ~     (p=0.353 n=10+10)
Repair/Leopard_16x16x512_ODS-8                  1.89MB ± 9%    1.89MB ±10%     ~     (p=0.912 n=10+10)
Repair/Leopard_32x32x512_ODS-8                  7.68MB ± 5%    7.79MB ± 8%     ~     (p=0.280 n=10+10)
Repair/Leopard_64x64x512_ODS-8                  32.0MB ± 5%    33.5MB ± 7%   +4.51%  (p=0.009 n=10+10)
Repair/Leopard_128x128x512_ODS-8                 131MB ± 7%     133MB ±11%     ~     (p=0.280 n=10+10)
Repair/Leopard_256x256x512_ODS-8                 440MB ± 1%     440MB ± 0%     ~     (p=0.912 n=10+10)
Repair/Leopard_512x512x512_ODS-8                1.72GB ± 0%    1.72GB ± 0%     ~     (p=0.965 n=8+10)
ExtensionEncoding/Leopard_4x4x512_ODS-8         38.3kB ± 0%    38.3kB ± 0%     ~     (p=0.529 n=10+10)
ExtensionEncoding/Leopard_8x8x512_ODS-8          147kB ± 0%     147kB ± 1%     ~     (p=0.113 n=10+9)
ExtensionEncoding/Leopard_16x16x512_ODS-8        614kB ± 1%     612kB ± 0%     ~     (p=0.436 n=10+10)
ExtensionEncoding/Leopard_32x32x512_ODS-8       2.51MB ± 1%    2.52MB ± 1%     ~     (p=0.247 n=10+10)
ExtensionEncoding/Leopard_64x64x512_ODS-8       10.3MB ± 2%    10.2MB ± 1%     ~     (p=0.447 n=10+9)
ExtensionEncoding/Leopard_128x128x512_ODS-8     42.6MB ± 3%    42.9MB ± 3%     ~     (p=0.258 n=9+9)
ExtensionEncoding/Leopard_256x256x512_ODS-8      127MB ± 0%     127MB ± 0%     ~     (p=0.529 n=10+10)
ExtensionEncoding/Leopard_512x512x512_ODS-8      512MB ± 0%     512MB ± 0%     ~     (p=0.971 n=10+10)
ExtensionWithRoots/Leopard_4x4x512_ODS-8         119kB ± 0%     119kB ± 0%     ~     (p=0.404 n=10+10)
ExtensionWithRoots/Leopard_8x8x512_ODS-8         356kB ± 1%     357kB ± 1%     ~     (p=0.143 n=10+10)
ExtensionWithRoots/Leopard_16x16x512_ODS-8      1.20MB ± 2%    1.20MB ± 1%     ~     (p=0.475 n=10+7)
ExtensionWithRoots/Leopard_32x32x512_ODS-8      4.36MB ± 2%    4.34MB ± 2%     ~     (p=0.673 n=8+9)
ExtensionWithRoots/Leopard_64x64x512_ODS-8      16.5MB ± 5%    16.6MB ± 3%     ~     (p=0.604 n=10+9)
ExtensionWithRoots/Leopard_128x128x512_ODS-8    73.4MB ±19%    77.8MB ±16%     ~     (p=0.089 n=10+10)
ExtensionWithRoots/Leopard_256x256x512_ODS-8     236MB ± 0%     237MB ± 0%     ~     (p=0.280 n=10+10)
ExtensionWithRoots/Leopard_512x512x512_ODS-8    1.01GB ± 0%    1.01GB ± 0%     ~     (p=0.159 n=9+8)

name                                          old allocs/op  new allocs/op  delta
Encoding/Leopard_128_shares_512-8                  132 ± 0%       132 ± 0%     ~     (all equal)
Decoding/Leopard_128_shares_512-8                 0.00           0.00          ~     (all equal)
EDSRoots/32x32x512_ODS-8                         34.1k ± 0%     34.1k ± 0%     ~     (p=1.000 n=10+10)
EDSRoots/64x64x512_ODS-8                          134k ± 0%      134k ± 0%     ~     (p=0.550 n=10+10)
EDSRoots/128x128x512_ODS-8                        530k ± 0%      530k ± 0%     ~     (p=0.248 n=10+8)
EDSRoots/256x256x512_ODS-8                       2.11M ± 0%     2.11M ± 0%     ~     (p=0.480 n=10+10)
EDSRoots/512x512x512_ODS-8                       8.42M ± 0%     8.42M ± 0%     ~     (p=0.467 n=8+8)
Repair/Leopard_4x4x512_ODS-8                       772 ± 0%       772 ± 0%     ~     (all equal)
Repair/Leopard_8x8x512_ODS-8                     2.86k ± 0%     2.86k ± 0%     ~     (p=0.802 n=9+9)
Repair/Leopard_16x16x512_ODS-8                   10.9k ± 0%     10.9k ± 0%     ~     (p=0.698 n=10+10)
Repair/Leopard_32x32x512_ODS-8                   42.2k ± 0%     42.2k ± 0%     ~     (p=0.433 n=10+9)
Repair/Leopard_64x64x512_ODS-8                    166k ± 0%      166k ± 0%     ~     (p=0.985 n=10+10)
Repair/Leopard_128x128x512_ODS-8                  660k ± 0%      660k ± 0%     ~     (p=0.725 n=10+10)
Repair/Leopard_256x256x512_ODS-8                 2.64M ± 0%     2.63M ± 0%     ~     (p=0.853 n=10+10)
Repair/Leopard_512x512x512_ODS-8                 10.5M ± 0%     10.5M ± 0%     ~     (p=0.965 n=8+10)
ExtensionEncoding/Leopard_4x4x512_ODS-8            153 ± 0%       153 ± 0%     ~     (all equal)
ExtensionEncoding/Leopard_8x8x512_ODS-8            389 ± 0%       389 ± 0%     ~     (all equal)
ExtensionEncoding/Leopard_16x16x512_ODS-8        1.15k ± 0%     1.15k ± 0%     ~     (all equal)
ExtensionEncoding/Leopard_32x32x512_ODS-8        3.82k ± 0%     3.82k ± 0%     ~     (all equal)
ExtensionEncoding/Leopard_64x64x512_ODS-8        13.8k ± 0%     13.8k ± 0%     ~     (all equal)
ExtensionEncoding/Leopard_128x128x512_ODS-8      52.1k ± 0%     52.1k ± 0%     ~     (p=0.353 n=10+7)
ExtensionEncoding/Leopard_256x256x512_ODS-8       201k ± 0%      201k ± 0%     ~     (p=0.340 n=10+10)
ExtensionEncoding/Leopard_512x512x512_ODS-8       795k ± 0%      795k ± 0%     ~     (p=0.154 n=9+9)
ExtensionWithRoots/Leopard_4x4x512_ODS-8           830 ± 0%       830 ± 0%     ~     (all equal)
ExtensionWithRoots/Leopard_8x8x512_ODS-8         2.79k ± 0%     2.79k ± 0%     ~     (all equal)
ExtensionWithRoots/Leopard_16x16x512_ODS-8       10.1k ± 0%     10.1k ± 0%     ~     (all equal)
ExtensionWithRoots/Leopard_32x32x512_ODS-8       38.0k ± 0%     38.0k ± 0%     ~     (all equal)
ExtensionWithRoots/Leopard_64x64x512_ODS-8        148k ± 0%      148k ± 0%     ~     (all equal)
ExtensionWithRoots/Leopard_128x128x512_ODS-8      583k ± 0%      583k ± 0%     ~     (p=0.894 n=10+10)
ExtensionWithRoots/Leopard_256x256x512_ODS-8     2.31M ± 0%     2.31M ± 0%     ~     (p=0.617 n=10+9)
ExtensionWithRoots/Leopard_512x512x512_ODS-8     9.22M ± 0%     9.22M ± 0%     ~     (p=0.443 n=9+9)

/cc @elias-orijtech @liamsi @rootulp @musalbas @staheri14

@rootulp
Copy link
Collaborator

rootulp commented Mar 5, 2024

Agreed and it seem like the optimization you propose still works as expected.

The performance penalty comes at the cost of (in my subjective opinion) slightly less code readability so I think we should only implement this optimization if we need to.

I'm also puzzled by the few instances where benchmarking shows this optimization results in a positive delta. I.e:

ExtensionWithRoots/Leopard_32x32x512_ODS-8      6.10ms ± 4%    7.32ms ±23%  +20.00%  (p=0.000 n=10+9)

Repair/Leopard_64x64x512_ODS-8                  32.0MB ± 5%    33.5MB ± 7%   +4.51%  (p=0.009 n=10+10)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants