Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adjustments to SOIT implementation #96

Merged
merged 6 commits into from
Jan 23, 2024
Merged

Adjustments to SOIT implementation #96

merged 6 commits into from
Jan 23, 2024

Conversation

danielmwatkins
Copy link
Collaborator

I came across some issues trying to follow the steps in the README for running things locally. Along with an issue in the README with an extra space in a pathname, I was reminded of the confusing naming convention for x/y and lat/lon and the issue that SOIT is forcing lat/lon to be integers. I couldn't find any reason that lat/lon should be integers -- all it does is add more error to the calculations -- so I changed "int" to "float" and I changed the references to x and y to what they really meant.

@@ -20,14 +20,14 @@ To run SOIT manually :
- [ ] ` export SPACEPSWD=<password>`
3. Update inputs and run the following:
```bash
docker run --env SPACEUSER --env SPACEPSWD --mount type=bind,source=<your_desired_output_dir>,target=/tmp brownccv/ \
icefloetracker-fetchdata:main \
docker run --env SPACEUSER --env SPACEPSWD --mount type=bind,source=<your_desired_output_dir>,target=/tmp brownccv/icefloetracker-fetchdata:main \
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The extra space in line 23 was giving me an uninformative error from docker "docker: invalid reference format". Deleting the space between brownccv/ and icefloetracker-fetchdata:main made it so it would run.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good find!

README.md Outdated
python3 /usr/local/bin/pass_time_cylc.py --startdate <YYYY-MM-DD> --enddate <YYYY-MM-DD> --csvoutpath /tmp --centroid_x <input_centroid_x> --centroid_y $<input_centroid_y> --SPACEUSER $SPACEUSER --SPACEPSWD $SPACEPSWD
```
* be sure to replace `source`, `startdate`, `enddate`, `centroid_x`, and `centroid_y` with your desired inputs
* csvoutpath must remain as `/tmp` to bind the Docker container output path with your desired local path

**Note:** The `pass_time_cylc.py` script in this project can be adapted to include additional satellites available in the [space-track.org](https://www.space-track.org/) repository.
**Note:** The `pass_time_cylc.py` script in this project can be adapted to include additional satellites available in the [space-track.org](https://www.space-track.org/) repository. If you have `numpy` and `skyfield` installed in a local `conda` environment, you can run `pass_time_cylc.py` from the directory where you installed the Ice Floe Tracker Pipeline:
```python3 workflow/scripts/pass_time_cylc.py --startdate <YYYY-MM-DD> --enddate <YYYY-MM-DD> --csvoutpath /tmp --centroid_x <input_centroid_x> --centroid_y $<input_centroid_y> --SPACEUSER $SPACEUSER --SPACEPSWD $SPACEPSWD```
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the advantage of running pass_time_cylc.py using a remote docker image rather than adding numpy and skyfield to the conda ift-env? To me it seems like it limits the flexibility of the script.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had issues with Conda collisions when using a local Conda because Julia already has an embedded Conda env. The Docker approach is more flexible because it works regardless of OS or config for the user. It should be possible to just load those deps in an environment and run the script locally, but difficult to support and maintain because of the different dependency resolutions between Windows, Mac, Linux. Any user can run it from the Docker container without worrying about resolving a Conda env.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. There is the need then to have docker running and to install docker, so in both cases there's installations to take care of.

):
centroidx = centroid_x
centroidy = centroid_y
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for changing variable names centroid_x to centroidx to lat in the original script? Why not use lat all the way through? For now I switched centroid_x to centroid_lat to make it clear that providing an x value would break things.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

centroid_x and centroid_y are the variable names used in the Cylc pipeline so I think I just had reset them during dev to keep track of the inputs. We'll probably just have to update the Cylc templates after this PR is merged.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I thought about updating those but didn't want to break the Docker stuff.

Copy link

codecov bot commented Jan 10, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (af4e10b) 94.88% compared to head (55b0c41) 94.88%.

Additional details and impacted files
@@           Coverage Diff           @@
##             main      #96   +/-   ##
=======================================
  Coverage   94.88%   94.88%           
=======================================
  Files           7        7           
  Lines         215      215           
=======================================
  Hits          204      204           
  Misses         11       11           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

configUsr = SPACEUSER
configPwd = SPACEPSWD
siteCred = {"identity": configUsr, "password": configPwd}
end_date = datetime.datetime.strptime(enddate, "%Y-%m-%d").strftime("%m-%d-%Y").split("-")
start_date = datetime.datetime.strptime(startdate, "%Y-%m-%d").strftime("%m-%d-%Y").split("-")
lat = int(centroidx)
long = int(centroidy)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change in variable names is just personal preference. I think long looks weird. The use of int is more troubling. Not only do we force the loss of accuracy on the order of 100 km, casting to int means we're not even going for the nearest integer. The wgs84.latlon function doesn't require integers -- the example on the skyfield API has decimal degrees, and the header for pass_time_cylc.py says that decimal degrees are accepted.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't remember if that's just how the original script was or if there was a reason we had to use int(). Did you test it with float() just to verify that the script runs?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did, it ran perfectly fine with float lat/lon on my machine

Copy link
Collaborator

@tdivoll tdivoll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good cleanups! I just had some small suggestions and for running the script we recommend using the Docker container (for reducing future headaches), but it should be possible to run the script locally if the necessary software are installed.

@@ -20,14 +20,14 @@ To run SOIT manually :
- [ ] ` export SPACEPSWD=<password>`
3. Update inputs and run the following:
```bash
docker run --env SPACEUSER --env SPACEPSWD --mount type=bind,source=<your_desired_output_dir>,target=/tmp brownccv/ \
icefloetracker-fetchdata:main \
docker run --env SPACEUSER --env SPACEPSWD --mount type=bind,source=<your_desired_output_dir>,target=/tmp brownccv/icefloetracker-fetchdata:main \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good find!

README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated
python3 /usr/local/bin/pass_time_cylc.py --startdate <YYYY-MM-DD> --enddate <YYYY-MM-DD> --csvoutpath /tmp --centroid_x <input_centroid_x> --centroid_y $<input_centroid_y> --SPACEUSER $SPACEUSER --SPACEPSWD $SPACEPSWD
```
* be sure to replace `source`, `startdate`, `enddate`, `centroid_x`, and `centroid_y` with your desired inputs
* csvoutpath must remain as `/tmp` to bind the Docker container output path with your desired local path

**Note:** The `pass_time_cylc.py` script in this project can be adapted to include additional satellites available in the [space-track.org](https://www.space-track.org/) repository.
**Note:** The `pass_time_cylc.py` script in this project can be adapted to include additional satellites available in the [space-track.org](https://www.space-track.org/) repository. If you have `numpy` and `skyfield` installed in a local `conda` environment, you can run `pass_time_cylc.py` from the directory where you installed the Ice Floe Tracker Pipeline:
```python3 workflow/scripts/pass_time_cylc.py --startdate <YYYY-MM-DD> --enddate <YYYY-MM-DD> --csvoutpath /tmp --centroid_x <input_centroid_x> --centroid_y $<input_centroid_y> --SPACEUSER $SPACEUSER --SPACEPSWD $SPACEPSWD```
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had issues with Conda collisions when using a local Conda because Julia already has an embedded Conda env. The Docker approach is more flexible because it works regardless of OS or config for the user. It should be possible to just load those deps in an environment and run the script locally, but difficult to support and maintain because of the different dependency resolutions between Windows, Mac, Linux. Any user can run it from the Docker container without worrying about resolving a Conda env.

README.md Outdated
python3 /usr/local/bin/pass_time_cylc.py --startdate <YYYY-MM-DD> --enddate <YYYY-MM-DD> --csvoutpath /tmp --centroid_x <input_centroid_x> --centroid_y $<input_centroid_y> --SPACEUSER $SPACEUSER --SPACEPSWD $SPACEPSWD
```
* be sure to replace `source`, `startdate`, `enddate`, `centroid_x`, and `centroid_y` with your desired inputs
* csvoutpath must remain as `/tmp` to bind the Docker container output path with your desired local path

**Note:** The `pass_time_cylc.py` script in this project can be adapted to include additional satellites available in the [space-track.org](https://www.space-track.org/) repository.
**Note:** The `pass_time_cylc.py` script in this project can be adapted to include additional satellites available in the [space-track.org](https://www.space-track.org/) repository. If you have `numpy` and `skyfield` installed in a local `conda` environment, you can run `pass_time_cylc.py` from the directory where you installed the Ice Floe Tracker Pipeline:
```python3 workflow/scripts/pass_time_cylc.py --startdate <YYYY-MM-DD> --enddate <YYYY-MM-DD> --csvoutpath /tmp --centroid_x <input_centroid_x> --centroid_y $<input_centroid_y> --SPACEUSER $SPACEUSER --SPACEPSWD $SPACEPSWD```
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe change the csvoutpath as well? If running locally, a user would need to know how to find /tmp, the results would not copy over to the resources dir of the repo like it does when using the Cylc flow.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call, just changed that

):
centroidx = centroid_x
centroidy = centroid_y
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

centroid_x and centroid_y are the variable names used in the Cylc pipeline so I think I just had reset them during dev to keep track of the inputs. We'll probably just have to update the Cylc templates after this PR is merged.

configUsr = SPACEUSER
configPwd = SPACEPSWD
siteCred = {"identity": configUsr, "password": configPwd}
end_date = datetime.datetime.strptime(enddate, "%Y-%m-%d").strftime("%m-%d-%Y").split("-")
start_date = datetime.datetime.strptime(startdate, "%Y-%m-%d").strftime("%m-%d-%Y").split("-")
lat = int(centroidx)
long = int(centroidy)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't remember if that's just how the original script was or if there was a reason we had to use int(). Did you test it with float() just to verify that the script runs?

workflow/scripts/pass_time_cylc.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@tdivoll tdivoll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like everything here is updated.

@danielmwatkins danielmwatkins merged commit afa9717 into main Jan 23, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants