DeepMC - Inconsistency in Formatting of Historical Observation Data #181

Closed
321zyx opened this issue Jul 24, 2024 · 3 comments
Labels
documentation Improvements or additions to documentation question Information requested from user

Comments

321zyx commented Jul 24, 2024

Topic

Documentation

Ask away!

In the information provided with the DeepMC notebook (https://github.com/microsoft/farmvibes-ai/blob/main/notebooks/deepmc/mc_forecast.ipynb), the explanatory text says that the fxx field should be in the format [start, stop, step], which for observations recorded every hour would be "[0, 24, 1]". But the code example below it says that the format is "fxx = [frequency_hour, number_of_hours + frequency_hour, 1]", which, according to the text that follows, would for the same example be "fxx: [1, 25, 1] # start, stop, step". Those numbers don't make sense as start and stop times measured in hours after the start of the day. Even if the first measurement at midnight is given the value 1, the last measurement of the day at 23:00 would be 24, not 25, unless the midnight observation is double counted in both the preceding and following days. I'm also not sure whether "frequency_hour" refers to a true frequency (measurements per hour) or to the time in hours between measurements, which is the reciprocal of the frequency and so still determines it. In the example given of one measurement per hour, both of those numbers would be the same.
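For reference, here is a minimal sketch of how the two triples quoted above differ. It assumes, which the notebook text does not confirm, that the triple is expanded with Python's built-in `range`, whose stop value is exclusive. `expand_fxx` is a hypothetical helper, not a function from the notebook or from Herbie.

```python
def expand_fxx(start, stop, step):
    """Expand a [start, stop, step] triple the way Python's range() would.

    Hypothetical helper for illustration only; the stop value is excluded.
    """
    return list(range(start, stop, step))


# "[0, 24, 1]" from the explanatory text
text_version = expand_fxx(0, 24, 1)
# "[1, 25, 1]" from the code example
code_version = expand_fxx(1, 25, 1)

# Both triples yield 24 values; they differ only in where counting starts.
print(len(text_version), text_version[0], text_version[-1])  # 24 0 23
print(len(code_version), code_version[0], code_version[-1])  # 24 1 24
```

Under that assumption, neither triple produces the value 25: the stop value only marks where the sequence ends.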

It makes little sense to me that the field would be "[frequency_hour, number_of_hours + frequency_hour, 1]", as the second number would always be 24 plus the first number and the last would always be 1. Only the first value would convey any information. But it doesn't really need to make sense to me. What I need to know is what values to use for data recorded every 5 minutes, 288 times a day. [0, 24, 0.08333] (start time, end time, and time in hours between measurements)? [12, 36, 1] (measurements per hour, number of hours in a day + measurements per hour, 1)? [0, 288, 0.08333] (time of first measurement of the day, number of measurements per day, and time in hours between measurements)? [0.08333, 24.08333, 1] (hours between measurements, 24 + hours between measurements, 1)? [1, 289, 0.08333] (first measurement of the day defined as 1, number of the last measurement of the day, hours between measurements)? The first and last of those seem most sensible to me, but what do I know?
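One way to narrow down the candidates is to check which of them Python's built-in `range` would accept at all. This is only a sketch under the assumption that the triple is handed to `range`; the names `candidates` and `try_range` are hypothetical, and the time step is written as 0.08333 throughout.

```python
# Candidate triples for 5-minute data, as listed in the question above.
candidates = {
    "[0, 24, 0.08333]": (0, 24, 0.08333),
    "[12, 36, 1]": (12, 36, 1),
    "[0, 288, 0.08333]": (0, 288, 0.08333),
    "[0.08333, 24.08333, 1]": (0.08333, 24.08333, 1),
    "[1, 289, 0.08333]": (1, 289, 0.08333),
}


def try_range(triple):
    """Return how many values range(*triple) yields, or None if range() rejects it."""
    try:
        return len(range(*triple))
    except TypeError:
        # range() only accepts integer arguments, so any float is rejected.
        return None


results = {label: try_range(triple) for label, triple in candidates.items()}
for label, count in results.items():
    status = "rejected by range()" if count is None else f"{count} values"
    print(f"{label} -> {status}")
```

If the assumption holds, only the all-integer triple survives, which suggests the field counts whole hours and cannot express a 5-minute spacing directly.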

@321zyx 321zyx added the question Information requested from user label Jul 24, 2024
@github-actions github-actions bot added documentation Improvements or additions to documentation triage Issues still not triaged by team labels Jul 24, 2024
@v-ngangarapu
Collaborator

Hi,
Thank you for your observations. The fxx parameter defined in the notebook is passed to Herbie, a third-party Python package, to download HRRR data, so its requirements come from Herbie. The parameter defines the start hour, end hour, and frequency (step) within a 24-hour period.

For example, to download data from 7/25/2024 13:00 to 7/26/2024 13:00, fxx would be [13, 13+24, 1]. In the background, this applies range(13, 13+24, 1), resulting in [13, 14, ..., 35, 36] (the stop value, 37, is excluded). Based on my understanding, whenever the fxx value exceeds 24, the Herbie package treats it as a request for data from the next day, starting from 7/26/2024 00:00:00.
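The expansion described here can be sketched with the standard library alone. The sketch follows the reading given in this comment, counting hours from midnight of the start day; that reference time (`base`) is an assumption. Herbie's own documentation describes fxx as a forecast lead time in hours, so the actual reference may be the model run time instead.

```python
from datetime import datetime, timedelta

# Assumed reference time: midnight of the start day (not confirmed by Herbie).
base = datetime(2024, 7, 25, 0, 0)
fxx = [13, 13 + 24, 1]  # start, stop, step

# range() excludes the stop value, so the offsets run from 13 through 36.
offsets = list(range(*fxx))
timestamps = [base + timedelta(hours=h) for h in offsets]

print(offsets[0], offsets[-1], len(offsets))  # 13 36 24
print(timestamps[0])                          # 2024-07-25 13:00:00
print(timestamps[-1])                         # 2024-07-26 12:00:00
```

Offsets of 24 and above roll over into the following day, and the 24 hourly values end one hour before the stop value.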

See this URL to learn more about the Herbie package:
https://herbie.readthedocs.io/en/stable/user_guide/tutorial/fast.html

Thanks

@rafaspadilha rafaspadilha removed the triage Issues still not triaged by team label Jul 30, 2024
@321zyx
Author

321zyx commented Jul 31, 2024

Thanks. I'll look there and see if I can make sense out of it all. I appreciate the help.

@rafaspadilha
Contributor

Closing this issue for now. Feel free to reopen it if you have any other questions.
