Implement expected analysis #236

paulf81 · 2024-11-06T19:06:30Z

Implement expected analysis

Feature or improvement description
This branch will implement the methods of uplift analysis described in the AWC validation methodology. It also renames some existing functions to better distinguish the separate methods for analyzing data and quantifying uplift.

Changes to be included:

flasc/analysis/expected_power_analysis.py

ejsimley · 2024-11-23T00:11:09Z

@paulf81, I was going to add a test to make sure the uplift variance is correct using _total_uplift_expected_power_with_standard_error, but then saw a couple of potential issues with the way uncertainty gets calculated.

First, if setting variance_only = True or fill_cov_with_var = True so we can still get an uncertainty estimate even when we have missing covariance terms, what happens when some variance terms are Null (because there is only one sample in a bin for the turbine)? Then it seems we still wouldn't be able to get a non-Null uplift variance. Should we remove bins where there is only one sample (or remove individual turbines in the bin from the rest of the analysis if there is only 1 sample)?
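
The undefined-variance case above can be reproduced outside the analysis code. The sketch below is plain Python using only the standard library, not the polars implementation in this branch; the helper name `bin_variance` and the power values are hypothetical, and `None` stands in for a Null entry.

```python
import statistics


def bin_variance(samples):
    """Sample variance of one turbine's power in one bin, or None if undefined."""
    # With fewer than two samples the n - 1 denominator is zero,
    # so there is no variance estimate to return.
    if len(samples) < 2:
        return None
    return statistics.variance(samples)


single = bin_variance([1500.0])                   # one sample in the bin
several = bin_variance([1500.0, 1520.0, 1480.0])  # three samples in the bin
```

Here `single` is `None` while `several` is a finite number, which is why a bin with a single sample for a turbine cannot contribute a variance term even when the covariance terms are ignored or filled.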

The other issue is when remove_any_null_turbine_bins = False, which is the default case. If a turbine is missing in a particular bin we can still compute the expected farm power in that bin, we just ignore the missing turbines. But when computing the farm power variance for that bin, I think the variance terms for all of the turbine pairs still get used. So the farm power variance will be Null for all of these cases, even though we intended to just not include the missing turbines for those bins. I think this could be addressed relatively easily when computing farm variance, but wanted to see what you think.
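
One way to picture the fix suggested above is to restrict the covariance sum to the turbines that actually contribute to the farm power in that bin. The following is a minimal plain-Python sketch under that assumption, not the code in this branch; the dictionary layout, the function name `farm_power_and_variance`, and the numbers are all hypothetical. It relies on the identity that the variance of a sum is the sum of all pairwise covariances.

```python
def farm_power_and_variance(means, cov):
    """Farm power and its variance for one bin, using only turbines with a mean.

    means: dict mapping turbine name -> mean power, or None if missing
    cov:   dict mapping (turbine_i, turbine_j) -> covariance, or None if undefined
    """
    # Turbines that contribute to the farm power in this bin
    valid = [t for t, m in means.items() if m is not None]
    farm_power = sum(means[t] for t in valid)

    # Var(sum_i X_i) = sum_i sum_j Cov(X_i, X_j), taken over valid turbines only
    terms = [cov[(i, j)] for i in valid for j in valid]
    # The variance is undefined only if a term among the valid turbines is missing
    farm_var = None if any(c is None for c in terms) else sum(terms)
    return farm_power, farm_var


means = {"t0": 1000.0, "t1": 900.0, "t2": None}  # t2 is missing in this bin
cov = {
    ("t0", "t0"): 4.0, ("t0", "t1"): 1.0, ("t0", "t2"): None,
    ("t1", "t0"): 1.0, ("t1", "t1"): 3.0, ("t1", "t2"): None,
    ("t2", "t0"): None, ("t2", "t1"): None, ("t2", "t2"): None,
}
farm_power, farm_var = farm_power_and_variance(means, cov)
```

In this example the Null terms involving `t2` never enter the sum, so the farm variance stays defined and refers to the same two turbines as the farm power.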

Also, let me know if I'm wrong about any of the above. I didn't get a chance to thoroughly check this behavior yet.

ejsimley · 2024-11-26T22:28:22Z

flasc/analysis/expected_power_analysis.py

+        test_cols=test_cols,
+        bin_cols_without_df_name=bin_cols_without_df_name,
+    )
+


I was thinking this could be a good place to call a new utility function that synchronizes nulls between df_cov and df_bin (something like "_synchronize_mean_power_cov_nulls"). Specifically, for each row, if there are any turbines in df_cov with undefined variances or covariances (because count < 2), then the mean power for those turbines would get set to Null as well. This way, we would always be able to return a standard error for the uplift by excluding turbines in a given bin with undefined covariance from both the expected uplift and uncertainty calculations. I like this because we wouldn't end up returning a NaN uncertainty value when calling this function. In most cases, though, we would probably just want to set variance_only = True or fill_cov_with_var = True to maximize the number of turbines that can be used to compute the uplift in each bin.
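
To make the proposal concrete, here is a minimal plain-Python sketch of the synchronization for a single bin. It is not the polars utility proposed above: the name `synchronize_nulls_sketch` and the dictionary layout are hypothetical, and it makes the simplifying assumption that a turbine is dropped only when its own variance is undefined. Missing off-diagonal terms between otherwise valid turbines are left untouched.

```python
def synchronize_nulls_sketch(means, cov):
    """Null out the mean power of turbines whose variance is undefined.

    means: dict mapping turbine name -> mean power, or None
    cov:   dict mapping (turbine_i, turbine_j) -> covariance, or None
    Returns new dicts; the inputs are not modified.
    """
    # Assumption: a turbine is invalid if its mean or its own variance is missing
    invalid = {
        t for t in means
        if means[t] is None or cov.get((t, t)) is None
    }
    synced_means = {t: (None if t in invalid else m) for t, m in means.items()}
    # Any covariance term that involves an invalid turbine is also set to None
    synced_cov = {
        (i, j): (None if i in invalid or j in invalid else c)
        for (i, j), c in cov.items()
    }
    return synced_means, synced_cov


means = {"t0": 1000.0, "t1": 900.0}
cov = {
    ("t0", "t0"): 4.0, ("t0", "t1"): 0.5,
    ("t1", "t0"): 0.5, ("t1", "t1"): None,  # t1 had a single sample in this bin
}
synced_means, synced_cov = synchronize_nulls_sketch(means, cov)
```

After the call, `t1` has no mean and no covariance terms, so the expected power and its uncertainty are both computed from `t0` alone.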

ejsimley · 2024-11-26T22:35:50Z

flasc/analysis/expected_power_analysis.py

+
+    # If any of the cov_cols are null, set pow_farm_var to null
+    df_bin = df_bin.with_columns(
+        pl.when(pl.all_horizontal([pl.col(c).is_not_null() for c in cov_cols]))


Connected to the comments above, I think we could get rid of this requirement that all cov_cols are defined. When computing the "pow_farm_var" column a few lines above, if my understanding is correct, the summation over cov_cols will just ignore any Null values. So, we'd be calculating the variance of the farm power considering only the turbines that are valid, in the same way that "pow_farm" will only sum the power of the turbines that are not Null. Therefore, the farm power variance will correspond to the same set of turbines used to compute farm power.

ejsimley · 2024-11-26T22:43:41Z

flasc/analysis/expected_power_analysis.py

+    if remove_any_null_turbine_bins:
+        df_bin = df_bin.filter(
+            pl.all_horizontal([pl.col(f"{c}_mean").is_not_null() for c in test_cols])
+        )


Connected to the comment above, at this point, regardless of the value of "remove_any_null_turbine_bins," we might need to remove rows where all test_cols are Null. Although rare, I think it is possible that if count < 2 for all turbines in a bin, then after synchronizing Nulls between df_cov and df_bin, we could be left with rows that are all Null that should get filtered out from the analysis.
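
The extra filter described above differs from the existing one only in keeping a row when at least one turbine is defined, rather than requiring all of them. A plain-Python sketch of that rule follows; the function name `drop_all_null_rows` and the list-of-dicts layout are hypothetical, and only the `_mean` column suffix is taken from the diff. In polars, the analogous expression would presumably be built with `pl.any_horizontal` in place of `pl.all_horizontal`.

```python
def drop_all_null_rows(rows, test_cols):
    """Keep rows in which at least one test turbine has a non-null mean power."""
    return [
        row for row in rows
        if any(row[f"{c}_mean"] is not None for c in test_cols)
    ]


test_cols = ["pow_000", "pow_001"]
rows = [
    {"wd_bin": 270, "pow_000_mean": 1000.0, "pow_001_mean": None},  # kept
    {"wd_bin": 272, "pow_000_mean": None, "pow_001_mean": None},    # dropped
]
kept = drop_all_null_rows(rows, test_cols)
```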

init expected power analysis

15eaff3

paulf81 added the enhancement label (an improvement of an existing feature) Nov 6, 2024

paulf81 requested review from misi9170 and ejsimley November 6, 2024 19:06

paulf81 assigned paulf81 and ejsimley Nov 6, 2024

paulf81 added 24 commits November 6, 2024 12:25

Rename energy ratio input to analysis input

59960b3

Update total uplift function name

bd0d792

rename total_uplift module

ec083b6

refactoring

7c9a502

Add output module

9655a1a

Add skeleton structure

4ac90ac

Add grouping function and test

8a6a295

Add initial work

b91712d

Updates

f9a83e5

Add covariance tests

d064255

Next steps

2987a55

v1 completion

e7d724b

fix mistakes in equations

a9bcf4e

start output object

bd05656

Complete and test public function

6a62889

Improve docstrings and type hints

94a0be7

formatting

e8688d5

Update ruff precommit

4af9093

Run pre-commit over files

8305f87

Split to utilities

16069ca

pre_commit

c1e9dbd

fix too long line

e89d387

missing docstring

cce9d78

fix long lines

e481d81

paulf81 added 5 commits November 15, 2024 15:03

Merge branch 'develop' into feature/expected_power

0294ec2

fix long line

7545706

Rough in full cov null

1017ed9

Start adding fill

1522fbe

Add options for filling or striking cov

aee52f7

ejsimley reviewed Nov 21, 2024

flasc/analysis/expected_power_analysis.py (resolved)

ejsimley reviewed Nov 21, 2024

flasc/analysis/expected_power_analysis.py (outdated, resolved)

paulf81 added 2 commits November 22, 2024 10:29

remove null std bins gone

d075534

add check on both variance_only and fill_cov_with_var

315ad86

paulf81 commented Nov 22, 2024

flasc/analysis/expected_power_analysis.py (outdated, resolved)

paulf81 added 2 commits November 22, 2024 10:54

remove refs in output

2a88411

Remove null option

7e7c5af

ejsimley reviewed Nov 26, 2024
