Fix/postprocessing pandas concat error #100

MaGering · 2024-02-28T14:10:25Z

Resolves the following error:

INFO - Restoring attributes will overwrite existing attributes.
Traceback (most recent call last):
  File "~/Schreibtisch/oemof-B3/scripts/postprocess.py", line 61, in <module>
    rdp = ResultsDataPackage.from_energysytem(es)
  File "~/anaconda3/envs/oemof-B3_desktop/lib/python3.10/site-packages/oemoflex/model/datapackage.py", line 320, in from_energysytem
    data, rel_paths = cls._get_results(cls, es)
  File "~/anaconda3/envs/oemof-B3_desktop/lib/python3.10/site-packages/oemoflex/model/datapackage.py", line 330, in _get_results
    data_scal, rel_paths_scal = self._get_scalars(self, es)
  File "~/anaconda3/envs/oemof-B3_desktop/lib/python3.10/site-packages/oemoflex/model/datapackage.py", line 405, in _get_scalars
    all_scalars = run_postprocessing(es)
  File "~/anaconda3/envs/oemof-B3_desktop/lib/python3.10/site-packages/oemoflex/model/postprocessing.py", line 809, in run_postprocessing
    all_scalars = map_var_names(all_scalars)
  File "~/anaconda3/envs/oemof-B3_desktop/lib/python3.10/site-packages/oemoflex/model/postprocessing.py", line 585, in map_var_names
    scalars.index = scalars.index.map(map_index)
  File "~/anaconda3/envs/oemof-B3_desktop/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 6491, in map
    new_values = self._map_values(mapper, na_action=na_action)
  File "~/anaconda3/envs/oemof-B3_desktop/lib/python3.10/site-packages/pandas/core/base.py", line 921, in _map_values
    return algorithms.map_array(arr, mapper, na_action=na_action, convert=convert)
  File "~/anaconda3/envs/oemof-B3_desktop/lib/python3.10/site-packages/pandas/core/algorithms.py", line 1743, in map_array
    return lib.map_infer(values, mapper, convert=convert)
  File "lib.pyx", line 2972, in pandas._libs.lib.map_infer
  File "~/anaconda3/envs/oemof-B3_desktop/lib/python3.10/site-packages/oemoflex/model/postprocessing.py", line 565, in map_index
    component_id = get_component_id(id)
  File "~/anaconda3/envs/oemof-B3_desktop/lib/python3.10/site-packages/oemoflex/model/postprocessing.py", line 522, in get_component_id
    component_id = get_component_id_in_tuple((id[0], id[1]))
  File "~/anaconda3/envs/oemof-B3_desktop/lib/python3.10/site-packages/oemoflex/model/postprocessing.py", line 170, in get_component_id_in_tuple
    return component_id
UnboundLocalError: local variable 'component_id' referenced before assignment

This error originates here with pandas==2.2.1; with pandas==2.0.3 it still worked.

The two Series invested_capacity and invested_storage_capacity have a Timestamp as their column name, and hence as their 'name' attribute, which mixes up the index order in the DataFrame created by the concat function.

Replacing it with list(map(lambda series: series.rename('0', inplace=True), all_scalars)) did not make any difference and neither did

    timestamp_variable = pd.to_datetime("2017-01-01 00:00:00")
    all_scalars = pd.concat(all_scalars, axis=0, keys=['source', 'target', 'var_name', '0', 0, 'var_value', timestamp_variable])
    all_scalars = all_scalars.droplevel(0)

So for now, a workaround is implemented: the MultiIndex of each Series is flattened into a single-level index of strings before concatenation, and the original index levels are recovered from those strings afterwards.
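
The idea behind the workaround can be sketched roughly as follows. This is only an illustration under assumptions, not the code from this PR: the sample data, the separator `SEP`, and the helpers `flatten` and `restore` are hypothetical, and the level names are taken from the description above.

```python
import pandas as pd

# Level names as mentioned in the PR description.
index_names = ["source", "target", "var_name"]

# Hypothetical stand-ins for the scalar results.
flow = pd.Series(
    [10.0, 20.0],
    index=pd.MultiIndex.from_tuples(
        [("wind", "bus-el", "flow_out"), ("bus-el", "demand", "flow_in")],
        names=index_names,
    ),
    name="0",
)
# A Series whose name is a Timestamp, as described for invested_capacity.
invested_capacity = pd.Series(
    [5.0],
    index=pd.MultiIndex.from_tuples(
        [("wind", "bus-el", "invest_out")], names=index_names
    ),
    name=pd.Timestamp("2017-01-01 00:00:00"),
)

SEP = "|"  # assumed separator; it must not occur in any index label


def flatten(series):
    """Return a copy with a common name and a single-level string index."""
    flat = series.rename("var_value")
    flat.index = [SEP.join(map(str, key)) for key in series.index]
    return flat


def restore(series):
    """Split the string index back into the original index levels."""
    tuples = [tuple(label.split(SEP)) for label in series.index]
    restored = series.copy()
    restored.index = pd.MultiIndex.from_tuples(tuples, names=index_names)
    return restored


# Concatenate on the flat index, then rebuild the MultiIndex.
all_scalars = restore(pd.concat([flatten(s) for s in (flow, invested_capacity)]))
```

Note that in this sketch the restored labels are plain strings, so any label that was not a string before flattening would have to be converted back separately.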

henhuy

Simple request/note.
Rest LGTM

henhuy · 2024-02-28T14:26:46Z

oemoflex/model/postprocessing.py

+    # Todo: To be further investigated
+
+    # Index work-around - issues with concat and Multiindex
+    list(map(lambda series: series.rename("0", inplace=True), all_scalars))


This is not recommended, as you are abusing list construction to perform side effects on a series (see e.g. https://stackoverflow.com/a/5753614/5804947).
Instead you could write:
all_scalars_renamed = [series.rename("0") for series in all_scalars]

or merge this directly with next call:
all_scalars_reindexed = [series.rename("0").reset_index() for series in all_scalars]
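
A small, self-contained sketch of what the two suggestions do, using made-up Series in place of the real `all_scalars` list (the data and the level names are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical stand-in for the list of scalar Series.
index = pd.MultiIndex.from_tuples(
    [("wind", "bus-el", "flow_out")], names=["source", "target", "var_name"]
)
all_scalars = [
    pd.Series([1.0], index=index, name="a"),
    pd.Series([2.0], index=index, name=pd.Timestamp("2017-01-01")),
]

# rename() without inplace returns new Series and leaves the inputs untouched.
all_scalars_renamed = [series.rename("0") for series in all_scalars]

# Chaining reset_index() turns the index levels into ordinary columns,
# with the values in a column named "0".
all_scalars_reindexed = [series.rename("0").reset_index() for series in all_scalars]
```

Because the comprehension builds a new list, the original `all_scalars` stays available unchanged.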

Thanks a lot! "any seasoned Pythonista will give you hell over it". Don't want to experience that :-D
Adapted with commit a31f733.

Unfortunately I get the following error with a31f733:

INFO - Traceback (most recent call last):
  File "~/Schreibtisch/oemof-B3/scripts/plot_scalar_results.py", line 524, in <module>
    scalars = load_scalar_results(scalars_path)
  File "~/Schreibtisch/oemof-B3/scripts/plot_scalar_results.py", line 502, in load_scalar_results
    df = dp.format_header(
  File "~/Schreibtisch/oemof-B3/oemof_b3/tools/data_processing.py", line 103, in format_header
    raise ValueError(f"There are extra columns {extra_colums}")
ValueError: There are extra columns ['0']

It is fixed with commit 0fc2956, but the DataFrame all_scalars now differs from the one produced before the pandas update. However, the results of the postprocessing seem to be the same: I compared the scalars.csv files with each other.

henhuy

LGTM - as before I did not test it locally.

MaGering · 2024-03-01T10:13:40Z

LGTM - as before I did not test it locally.

Great, thank you. I just tested it in oemof-B3 (again) and additionally against the dev version. The results are the same except for minimal changes, such as -0 instead of 0 or numbers on the order of 1e-13 instead of 0, which are probably caused by the new pandas version.
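
For reference, differences of this kind can be checked with a tolerance instead of an exact comparison. The following is a minimal sketch with made-up values standing in for the var_value column of two scalars.csv files; the tolerance of 1e-9 is an assumption, not a value used in this PR.

```python
import numpy as np
import pandas as pd

# Hypothetical values; the real data would come from the two scalars.csv files.
old = pd.DataFrame({"var_value": [0.0, 1.5, 0.0]})
new = pd.DataFrame({"var_value": [-0.0, 1.5, 3e-13]})

# An exact comparison flags the tiny residual (-0.0 == 0.0 is already True).
exact_equal = bool((old["var_value"] == new["var_value"]).all())

# An absolute tolerance treats values on the order of 1e-13 as zero.
close = np.isclose(old["var_value"], new["var_value"], rtol=0.0, atol=1e-9)

# Raises an AssertionError if any value differs by more than the tolerance.
pd.testing.assert_frame_equal(old, new, check_exact=False, rtol=0.0, atol=1e-9)
```

An absolute tolerance is used here because a relative tolerance alone cannot absorb a deviation from exactly 0.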

MaGering added 3 commits February 28, 2024 15:00

Remove broken code and add explanation what is broken

434fca3

Add code which fixes the concat problem as a work around

60c1d8b

Adapt to new variable in order to not overwrite list all_scalars

ff0f4c7

MaGering self-assigned this Feb 28, 2024

MaGering added the bug label Feb 28, 2024

MaGering added this to the v0.0.3 milestone Feb 28, 2024

MaGering requested a review from henhuy February 28, 2024 14:12

henhuy requested changes Feb 28, 2024

MaGering added 2 commits February 28, 2024 19:20

Rename Series via series.rename() instead of list comprehension

a31f733

Change name of all_scalars_df to var_value

0fc2956

MaGering requested a review from henhuy February 29, 2024 10:44

henhuy approved these changes Feb 29, 2024

MaGering merged commit 1c7e983 into dev Mar 1, 2024
2 checks passed

MaGering deleted the fix/postprocessing_pandas_concat_error branch March 1, 2024 10:13
