Duplicated GADM_IDs appeared in gadm_shapes #1094

Open
2 tasks done
CUIJING03 opened this issue Sep 3, 2024 · 2 comments
Labels
bug Something isn't working

Comments

CUIJING03 commented Sep 3, 2024

Checklist

  • I am using the current main branch or the latest release. Please indicate.

  • I am running on an up-to-date pypsa-earth environment. Update via conda env update -f envs/environment.yaml.

Describe the Bug

After updating to version 0.4.0, I found that the rule build_bus_regions no longer works. The log reports the error ValueError: All arrays must be of the same length. While debugging build_bus_regions.py, I found multiple duplicate GADM_IDs in gadm_shapes (see the figures below). One additional point: I am using alternative_clustering.

The first figure shows one pair of duplicate GADM_IDs in detail; the second figure shows all the duplicate GADM_IDs.
[Figure 1: gadm_shapes]
[Figure 2: 307c71792de240045ba557debe2b311]
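For anyone who wants to reproduce the check without the screenshots, here is a minimal sketch of how duplicated IDs could be listed with pandas. The helper name find_duplicate_ids and the sample values are made up for illustration; only the column name GADM_ID comes from this issue. A GeoDataFrame should accept the same calls, since it subclasses DataFrame.

```python
import pandas as pd


def find_duplicate_ids(df: pd.DataFrame, id_col: str = "GADM_ID") -> pd.DataFrame:
    """Return every row whose ID occurs more than once (hypothetical helper)."""
    # keep=False marks all members of a duplicate group, not only the repeats
    mask = df[id_col].duplicated(keep=False)
    return df.loc[mask].sort_values(id_col)


# Made-up sample data, not taken from the real gadm_shapes file
sample = pd.DataFrame(
    {
        "GADM_ID": ["CN.1_1", "CN.2_1", "CN.2_1", "CN.3_1"],
        "name": ["a", "b", "c", "d"],
    }
)

dups = find_duplicate_ids(sample)
print(dups)
```

Applied to the real file, an empty result would mean that every GADM_ID is unique.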

CUIJING03 added the bug (Something isn't working) label Sep 3, 2024
CUIJING03 (Author) commented

My build_bus_regions log is as follows:

INFO:pypsa.io:Imported network base.nc has buses, lines, links, transformers
ERROR:_helpers:An error happened in module 'D:\\Anaconda3\\envs\\pypsa-earth\\lib\\site-packages\\pandas\\core\\internals\\construction.py', function '_extract_index': All arrays must be of the same length
Traceback (most recent call last):
  File "D:\pypsa-earth-project\pypsa-earth\.snakemake\scripts\tmpmtwwtd5g.build_bus_regions.py", line 205, in <module>
    temp_region = gpd.GeoDataFrame(
  File "D:\Anaconda3\envs\pypsa-earth\lib\site-packages\geopandas\geodataframe.py", line 139, in __init__
    super().__init__(data, *args, **kwargs)
  File "D:\Anaconda3\envs\pypsa-earth\lib\site-packages\pandas\core\frame.py", line 778, in __init__
    mgr = dict_to_mgr(data, index, columns, dtype=dtype, copy=copy, typ=manager)
  File "D:\Anaconda3\envs\pypsa-earth\lib\site-packages\pandas\core\internals\construction.py", line 503, in dict_to_mgr
    return arrays_to_mgr(arrays, columns, index, dtype=dtype, typ=typ, consolidate=copy)
  File "D:\Anaconda3\envs\pypsa-earth\lib\site-packages\pandas\core\internals\construction.py", line 114, in arrays_to_mgr
    index = _extract_index(arrays)
  File "D:\Anaconda3\envs\pypsa-earth\lib\site-packages\pandas\core\internals\construction.py", line 677, in _extract_index
    raise ValueError("All arrays must be of the same length")
ValueError: All arrays must be of the same length
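The final error comes from pandas itself: a DataFrame built from a dict needs every column to have the same length. The sketch below reproduces that message in isolation. It is an assumption, not something the log confirms, that the duplicated GADM_IDs cause the script to select more geometries than names; the function try_build and its arguments are hypothetical.

```python
import pandas as pd


def try_build(names, shapes):
    """Build a frame from two columns; return the error text if pandas rejects it."""
    try:
        return pd.DataFrame({"name": names, "geometry": shapes})
    except ValueError as err:
        return str(err)


# One name but two shapes, as could happen if an ID lookup matched two rows
# (assumed scenario, placeholder strings instead of real geometries)
msg = try_build(["bus_1"], ["shape_a", "shape_b"])
print(msg)

# Equal lengths build without error
ok = try_build(["bus_1", "bus_2"], ["shape_a", "shape_b"])
print(ok)
```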

davide-f (Member) commented Sep 3, 2024

Great spot, @CUIJING03!
Could you check whether this code works:

# renaming 3 letter to 2 letter ISO code before saving GADM file
# In the case of a contested territory in the form 'Z00.00_0', save 'AA.00_0'
# Include bugfix for the case of 'XXX00_0' where the "." is missing, such as for Ghana
df_gadm["GADM_ID"] = df_gadm["country"] + df_gadm["GADM_ID"].str[3:].apply(
    lambda x: x if x.find(".") == 0 else "." + x
)
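To try the snippet outside the workflow, here is a small self-contained sketch on made-up rows. The wrapper name normalize_gadm_ids and the sample IDs are hypothetical; the sketch assumes, as the snippet does, that the country column already holds the 2-letter code and that GADM_ID starts with a 3-letter code. It uses startswith(".") in place of find(".") == 0, which is equivalent here.

```python
import pandas as pd


def normalize_gadm_ids(df_gadm: pd.DataFrame) -> pd.DataFrame:
    """Swap the 3-letter prefix for the 2-letter code, adding a missing '.' (hypothetical wrapper)."""
    # Drop the 3-letter prefix, then make sure the remainder starts with "."
    suffix = df_gadm["GADM_ID"].str[3:].apply(
        lambda x: x if x.startswith(".") else "." + x
    )
    out = df_gadm.copy()
    out["GADM_ID"] = out["country"] + suffix
    return out


# Made-up rows: the first one lacks the "." after the country code
sample = pd.DataFrame(
    {
        "country": ["GH", "GH", "NG"],
        "GADM_ID": ["GHA1_1", "GHA.2_1", "NGA.1_1"],
    }
)

renamed = normalize_gadm_ids(sample)
print(renamed["GADM_ID"].tolist())
```

Working on a copy leaves the input frame unchanged, which makes it easier to compare the IDs before and after.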

Moreover, if you open a PR for this, it would be great if you could move the fix into the function filter_gadm:

def filter_gadm(
