Expansion of weather stations included #59

Open
DLDonaldson opened this issue Dec 11, 2019 · 7 comments

Comments

@DLDonaldson

In exploring the functionality, it seems that the majority of the stations included are in the U.S. and Australia. It would be great if data for weather stations in other regions of the world could be included. I am particularly interested in the UK and other portions of Europe but it would be great to explore the possibilities of expansion to cover as much of the world as possible.

@ssuffian
Contributor

ssuffian commented Dec 11, 2019

The weather data is sourced from NOAA, which has access to stations globally (although the website, ftp://ftp.ncdc.noaa.gov/pub/data/noaa, is unfortunately down at the moment). We build a database that contains only certain weather stations, using the eeweather rebuild_db command. This command limits the database to US and Australian stations, as you can see here: https://github.com/openeemeter/eeweather/blob/master/eeweather/database.py#L186

The `eeweather rebuild_db` command stores a sqlite3 db file that contains only the stations that pass the filter on the line above. If you would like to expand the set of stations, you could try adjusting that line, or possibly add a CLI parameter to the rebuild_db call so that countries can be passed in. It may be tough to do while NOAA is down, but it should hopefully be back up in the next few days. If you get it working, please consider submitting it as a pull request!

Let us know if you need any help navigating the code.
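As a rough illustration of the country-parameter idea above, here is a minimal sketch in plain Python. It is not the eeweather implementation (which filters a pandas DataFrame in database.py); `select_countries` is a hypothetical helper, and the only thing taken from this thread is that station rows carry a `CTRY` code such as "US", "AS" (Australia), "CA" (Canada), or "NZ".

```python
def select_countries(rows, countries):
    """Return the station rows whose CTRY code is in `countries`.

    `rows` is an iterable of dicts, e.g. from csv.DictReader over a station
    history file. Hypothetical helper, not part of eeweather.
    """
    # Normalise the requested codes once so the comparison is case-insensitive.
    wanted = {code.upper() for code in countries}
    # Rows with a missing or empty CTRY value are dropped.
    return [row for row in rows if (row.get("CTRY") or "").upper() in wanted]


stations = [
    {"USAF": "000001", "CTRY": "US"},
    {"USAF": "000002", "CTRY": "NZ"},
    {"USAF": "000003", "CTRY": "UK"},
]
selected = select_countries(stations, ["US", "AS", "CA", "NZ"])
```

A CLI option on the rebuild command could then pass its list of codes straight through to a filter like this, instead of the hard-coded US and Australian conditions.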

@philngo
Contributor

philngo commented Dec 13, 2019

@DLDonaldson @ssuffian I did a quick experiment to see how big the metadata database gets when it includes all of the weather stations in the world. It grows from 11.7 MB to 25 MB, which is probably reasonable and still well below the PyPI package size limit, which would be our upper bound. That is a little big for a Python package, and we could probably slim it down a bit or separate it from the library itself, but I think I could be convinced to move to worldwide support. When I get a chance I will create a branch or tag with that change so that we can test it out in practice.

@bhough199

@philngo I'm trying to expand the list of weather stations to include NZ and Canada (worldwide as mentioned in this issue would be great but I wanted to just start with what I need). I managed to get it almost working with the following changes:

  1. delete eeweather/eeweather/resources/metadata.db
  2. Change database.py as shown in this diff:
diff --git a/eeweather/database.py b/eeweather/database.py
index 8466f9f..68e406b 100644
--- a/eeweather/database.py
+++ b/eeweather/database.py
@@ -181,9 +181,11 @@ def _load_isd_station_metadata(download_path):
     )
 
     isAus = isd_history.CTRY == "AS"
+    isCan = isd_history.CTRY == "CA"
+    isNZ = isd_history.CTRY == "NZ"
 
     metadata = {}
-    for usaf_station, group in isd_history[hasGEO & hasUSAF & (isUS | isAus)].groupby("USAF"):
+    for usaf_station, group in isd_history[hasGEO & hasUSAF & (isUS | isAus | isCan | isNZ)].groupby("USAF"):
         # find most recent
         recent = group.loc[group.END.idxmax()]
         wban_stations = list(group.WBAN)
  3. Find a new source for the CA_Building_Standards_Climate_Zones.zip file, which is missing from the official ca.gov site. I know the place I found it probably isn't a long-term solution to this problem, but I couldn't rebuild the database without this file.
diff --git a/scripts/create_ca_climate_zone_geojson.sh b/scripts/create_ca_climate_zone_geojson.sh
index 145711f..732408b 100755
--- a/scripts/create_ca_climate_zone_geojson.sh
+++ b/scripts/create_ca_climate_zone_geojson.sh
@@ -5,7 +5,7 @@ DATA_DIR=${1:-data}
 mkdir -p $DATA_DIR
 
 # download and install CA climate zone raw data
-wget -N http://ww2.energy.ca.gov/maps/renewable/CA_Building_Standards_Climate_Zones.zip -P $DATA_DIR -q --show-progress
+wget -N https://community.esri.com/servlet/JiveServlet/download/176380-1-158805/CA_Building_Standards_Climate_Zones.zip -P $DATA_DIR -q --show-progress
 unzip -q -o $DATA_DIR/CA_Building_Standards_Climate_Zones.zip -d $DATA_DIR
 
 # reproject to ESRI Shapefile
  4. After those changes I ran the eeweather rebuild-db command from inside the shell of my Docker image and it worked. I am able to get weather stations and weather data for NZ and Canada.

In making these changes I somehow broke the is_tmy3=True parameter of eeweather.rank_stations(), even when I am looking for stations in the US. If I pass is_tmy3=True, the response is (None, [EEWeatherWarning(qualified_name=eeweather.no_weather_station_selected)]) regardless of where I look for a weather station. Is there a step I overlooked when rebuilding the database that might have broken this?

@DLDonaldson
Author

DLDonaldson commented Nov 12, 2020

@philngo I did some work earlier this year to slim down the number of stations worldwide based on the duration of the history and the amount of data available for each station. That might be a good way to reduce the overall number of stations in moving to worldwide support if you want to filter it down somewhat. If we were to expand the worldwide coverage that might simultaneously address the issue raised by @bhough199.

Also perhaps the TMY3 problem is a result of #63.
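The idea of filtering stations by the length of their history could be sketched roughly as follows. This is an assumption-laden illustration, not the filtering that was actually done: it assumes each station row has `BEGIN` and `END` dates as YYYYMMDD strings (as in NOAA's ISD station history file), the helper names are hypothetical, and the 10-year threshold is an arbitrary placeholder. It does not address the "amount of data available" criterion, which would need the observation files themselves.

```python
from datetime import datetime


def history_years(begin, end):
    """Length of a station record in years, given YYYYMMDD strings."""
    start = datetime.strptime(begin, "%Y%m%d")
    stop = datetime.strptime(end, "%Y%m%d")
    return (stop - start).days / 365.25


def long_history(rows, min_years=10):
    """Keep rows whose BEGIN..END span is at least `min_years` (placeholder threshold)."""
    return [
        row for row in rows
        if history_years(row["BEGIN"], row["END"]) >= min_years
    ]


stations = [
    {"USAF": "000001", "BEGIN": "20000101", "END": "20200101"},  # about 20 years
    {"USAF": "000002", "BEGIN": "20190101", "END": "20200101"},  # about 1 year
]
kept = long_history(stations)
```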

@philngo
Contributor

philngo commented Nov 13, 2020

@DLDonaldson Worldwide coverage is definitely something I am interested in pursuing, but I'll need some support to move it forward. I had considered at one point adding a download step that fetches the whole database, or whichever part of it is necessary for your task, which would decouple it from the PyPI release schedule. Filtering things down also seems like a pretty reasonable approach.
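One possible shape for such a download step is a "fetch on first use" helper. This is only a sketch under assumptions: `ensure_database` and the `fetch` callable are hypothetical names, and no hosting location for the database exists in this thread, so the actual download is left to the caller.

```python
from pathlib import Path


def ensure_database(path, fetch):
    """Return `path`, calling `fetch(path)` first if the file is missing.

    `fetch` is a caller-supplied function that writes the database file to
    the given location (for example by downloading it). Hypothetical helper.
    """
    path = Path(path)
    if not path.exists():
        # Create the cache directory on first use, then populate the file.
        path.parent.mkdir(parents=True, exist_ok=True)
        fetch(path)
    return path
```

A caller might pass something like `lambda p: urllib.request.urlretrieve(DB_URL, str(p))` as `fetch`, where `DB_URL` is a placeholder for wherever the file would be hosted. Because the file is only fetched when absent, the database could be updated independently of package releases.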

@bhough199 Thanks for sharing what you did to get the rebuild working again; that will help other power users figure out how to rebuild from source. There was a step in the database build which I think scraped the old NREL site for the TMY3 station metadata, and it's possible that this is also now broken. Let us know if #65 fixes your issue.
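One way to check whether a rebuild silently produced empty tables (for example, if the TMY3 scrape failed) is to count the rows in every table of the rebuilt sqlite file. The sketch below uses only the standard-library sqlite3 module and makes no assumption about eeweather's table names; `table_row_counts` is a hypothetical helper, and the path to pass in is the metadata.db mentioned earlier in this thread.

```python
import sqlite3


def table_row_counts(db_path):
    """Return {table_name: row_count} for every table in a sqlite file."""
    conn = sqlite3.connect(db_path)
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        # Table names come from sqlite_master itself, so quoting them is enough.
        return {
            name: conn.execute('SELECT COUNT(*) FROM "{}"'.format(name)).fetchone()[0]
            for name in names
        }
    finally:
        conn.close()
```

If a TMY3-related table shows zero rows after the rebuild, that would be consistent with the scraping step having failed rather than with a problem in the station ranking itself.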

@bhough199

@philngo I tested with the newest release after you fixed #65 but I am unfortunately still getting the same problem with eeweather.rank_stations().

Also, in case anyone else is trying my approach, the link to the CA_Building_Standards_Climate_Zones.zip file that I showed in my earlier comment has changed. I knew it wasn't a reliable link; maybe you could host this file in the same place you put the TMY3 weather data, since it is no longer available from the official source? The new link I found today is https://community.esri.com/ccqpr47374/attachments/ccqpr47374/coordinate-reference-systemsforum-board/1814/1/CA_Building_Standards_Climate_Zones.zip

@philngo
Contributor

philngo commented Nov 18, 2020

@bhough199 Would you mind opening a new issue for this out-of-date source problem so we can track it separately from weather station expansion? I think it is a good idea to host our own version of the source files to prevent this from happening again; perhaps we can track that work in the new issue. I would appreciate help tracking down any other sources that are out of date, if you find any. One source that may help with the current rank_stations issue is this one (untested), which I believe you should be able to use from archive.org in the rebuilding step: http://web.archive.org/web/20181119091712/https://rredc.nrel.gov/solar/old_data/nsrdb/1991-2005/tmy3/by_USAFN.html
