-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix/postprocessing pandas concat error #100
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simple request/note.
Rest LGTM
oemoflex/model/postprocessing.py
Outdated
# Todo: To be further investigated | ||
|
||
# Index work-around - issues with concat and Multiindex | ||
list(map(lambda series: series.rename("0", inplace=True), all_scalars)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not recommended, as you are abusing list generation to perform actions on a series (see i.e. https://stackoverflow.com/a/5753614/5804947).
Instead you could write:
all_scalars_renamed = [series.rename("0") for series in all_scalars]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or merge this directly with next call:
all_scalars_reindexed = [series.rename("0").reset_index() for series in all_scalars]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot! "any seasoned Pythonista will give you hell over it". Don't want to experience that :-D
Adapted with commit a31f733.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately I get the following error with a31f733:
INFO -
Traceback (most recent call last):
File "~/Schreibtisch/oemof-B3/scripts/plot_scalar_results.py", line 524, in <module>
scalars = load_scalar_results(scalars_path)
File "~/Schreibtisch/oemof-B3/scripts/plot_scalar_results.py", line 502, in load_scalar_results
df = dp.format_header(
File "~/Schreibtisch/oemof-B3/oemof_b3/tools/data_processing.py", line 103, in format_header
raise ValueError(f"There are extra columns {extra_colums}")
ValueError: There are extra columns ['0']
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is fixed with commit 0fc2956, but the DataFrame all_scalars from the version before the pandas update is now different from the current dataframe. However, the results of the postprocessing seem to be the same. I have compared the scalars.csv files with each other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - as before I did not test it locally.
Great thank you, just tested it in oemof-B3 (again) and in addition against the dev version. Results are same except for minimal changes such as -0 instead of 0 or numbers in the e^-13 range instead of 0, for example, which is probably caused by the new pandas version. |
Resolves
This errors originates here with pandas==2.2.1. With pandas==2.0.3 it still worked.
The two Series
invested_capacity
andinvested_storage_capacity
have a TimeStamp as column name and hence in 'name' which mixes up the index order in the Dataframe created with the concat function.Replacing it with
list(map(lambda series: series.rename('0', inplace=True), all_scalars))
did not make any difference and neither didSo for now, a workaround is implemented where the multi-index is written to a one-level/single-index and then retrieved from the single-index-string (with the multi-index in it) after concatenation.