Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add storage_units to CSV export #562

Closed
nesnoj opened this issue Sep 12, 2024 · 3 comments · Fixed by #565
Closed

Add storage_units to CSV export #562

nesnoj opened this issue Sep 12, 2024 · 3 comments · Fixed by #565
Assignees
Labels
🚀 feature New feature or request

Comments

@nesnoj
Copy link
Collaborator

nesnoj commented Sep 12, 2024

Description of the issue

The DB table storage_units cannot be exported using db.to_csv() #560

Steps to Reproduce

  • db.to_csv("storages") does not include units
  • db.to_csv("storage_units") raises
ValueError: Allowed values for parameter data with API method are ['wind', 'solar', 'biomass', 'hydro', 'gsgk', 'combustion', 'nuclear', 'storage'] or ['balancing_area', 'electricity_consumer', 'gas_consumer', 'gas_producer', 'gas_storage', 'gas_storage_extended', 'grid_connections', 'grids', 'locations_extended', 'market_actors', 'market_roles', 'permit', 'deleted_units', 'retrofit_units', 'changed_dso_assignment']

Ideas of solution

Add to export function
Joins are implemented here

@nesnoj nesnoj added the 🐛 bug Something isn't working label Sep 12, 2024
@nesnoj nesnoj self-assigned this Sep 12, 2024
@nesnoj nesnoj added 🚀 feature New feature or request and removed 🐛 bug Something isn't working labels Sep 12, 2024
@nesnoj
Copy link
Collaborator Author

nesnoj commented Sep 17, 2024

I dug a little - my findings:

  1. storage_units (GetStromSpeicher / anlagenstromspeicher / StorageUnits)
    Column NutzbareSpeicherkapazitaet included, filled, table cannot be exported to CSV
  2. storage_extended (GetEinheitStromSpeicher / einheitenstromspeicher / StorageExtended)
    Column NutzbareSpeicherkapazitaet included, empty, table can be exported to CSV

How to proceed? Options:

  1. Automatically join storage_units on storage_extended to get this parameter. The problem is, that the field storage_units.VerknuepfteEinheit sometimes holds multiple mastr ids which would either result in data loss or multiple lines in storage_extended (one for each line). I recommend let the user decide how to join manually.
  2. Export storage_units when calling db.to_csv("storage") . A proper solution would be costly: The current export set involves ["unit_data", "eeg_data", "kwk_data", "permit_data"] (in this case, "unit_data" comes from storage_extended) and all of these 4 are joined into one table. So either we run into the join problem described in 1. or have to add extra stuff for export to a separate file.
  3. Export storage_units when calling db.to_csv("storage_units") -> add to ADDITIONAL_TABLES and ORM_MAP which I tested and it works as expected. I'd also add some instructions to the docs how to join (some further ideas in Collect features for new postprocessing #411).

I tend to implement 3.
What do you think @FlorianK13?

@FlorianK13
Copy link
Member

This raises the question of how the table structure is updated @FlorianK13 ? Is this done manually? The last update according to WSDL patchnotes has taken place in April?

I guess you talk about updating the orm.py file from here, right? This is updated manually. We could think about how to make that automated. A few years ago the problem was that the data model from bulk download and from soap API were not the same. Some rows only exist in one source, some column names have spelling mistakes in one of the sources...

How to proceed? Options:

I'm fine with 3. The proper solution from 2 would be great of course, but we also keep our own resources in mind. I have no use case for such a storages table right now, so I cannot spend time to implement this. I think for you @nesnoj this is the same case.

@nesnoj
Copy link
Collaborator Author

nesnoj commented Sep 17, 2024

This raises the question of how the table structure is updated @FlorianK13 ? Is this done manually? The last update according to WSDL patchnotes has taken place in April?

I guess you talk about updating the orm.py file from here, right? This is updated manually. We could think about how to make that automated. A few years ago the problem was that the data model from bulk download and from soap API were not the same. Some rows only exist in one source, some column names have spelling mistakes in one of the sources...

Thank you. I think as soon as the SOAP API is dropped, we can talk about automation. New issue: #563

How to proceed? Options:

I'm fine with 3. The proper solution from 2 would be great of course, but we also keep our own resources in mind. I have no use case for such a storages table right now, so I cannot spend time to implement this. I think for you @nesnoj this is the same case.

👍 Sounds good, I'll prepare sth

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🚀 feature New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants