Make API requests and response typesafe #4

Open
atombrenner opened this issue Feb 10, 2024 · 5 comments
Labels
enhancement New feature or request

Comments

@atombrenner

Feature Request

This package does not really utilize TypeScript as the title of the package promises.

The parameters of the request are of type any, so the IDE can't help with autocompletion and
the compiler can't check if the parameters are valid. Also the response of the client seems to
be just the unprocessed flatbuffers object from @openmeteo/sdk without any processing.
This has a few problems:

  • the user needs to understand and work with low-level flatbuffers types and convert values
  • the user must use indices to map requested parameters to returned data
  • all values are optional (nullable), so client code needs to check each value before using it.
    In the examples this leads to lots of ! non-null assertions sprinkled over the code, which is
    typically an indicator of suboptimal typing.
  • the IDE can't provide autocompletion on the response
  • the compiler can't check whether you try to access the wrong weather variables

I propose a solution that abstracts away the flatbuffers objects and instead returns something that is
typed as a subset of the JSON response of the API. Conversion is done internally, only "normal" JavaScript
types like number and string are used, and you don't need to manage indices yourself:

const data = await fetchWeatherData({
  latitude: 49.0699764,
  longitude: 11.614277,
  timezone: 'Europe/Berlin',
  forecast_days: 3,
  daily: ['sunshine_duration', 'sunrise', 'sunset'],
})

const sunrise = new Date(data.daily.sunrise[0])
console.log('sunrise day 1', sunrise)

I have created a PoC package at @atombrenner/openmeteo so you can play around with my idea a bit.
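
A minimal sketch of how the requested variable names could drive the response type, so that index mapping happens internally and the compiler rejects variables that were not requested. The names `DailyVariable` and `toDailyRecord` are hypothetical and are not taken from the PoC package or from @openmeteo/sdk; the variable list is a small illustrative subset.

```typescript
// Hypothetical subset of daily variable names; the real API offers many more.
type DailyVariable = 'sunshine_duration' | 'sunrise' | 'sunset' | 'temperature_2m_max'

// Maps positional columns (as returned by an index-based response) to an
// object keyed by the requested variable names.
function toDailyRecord<D extends DailyVariable>(
  requested: readonly D[],
  columns: readonly number[][],
): Record<D, number[]> {
  const result = {} as Record<D, number[]>
  requested.forEach((variable, i) => {
    const column = columns[i]
    if (column === undefined) {
      throw new Error(`missing column for ${variable}`)
    }
    result[variable] = column
  })
  return result
}

// Only the requested keys exist on the result type.
const daily = toDailyRecord(['sunrise', 'sunset'], [[1707545000], [1707580000]])
console.log(daily.sunrise, daily.sunset)
// daily.temperature_2m_max  // compile error: this variable was not requested
```

Because `D` is inferred from the array literal, autocompletion on `daily.` would list exactly the requested variables.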

@atombrenner atombrenner added the enhancement New feature or request label Feb 10, 2024
@philheller

I agree, none of the advantages TS offers are being used here so far. Love your PoC @atombrenner !

@patrick-zippenfenig
Member

Hi @atombrenner! Your library is certainly a nicer approach. I am under no illusion: the provided SDK with compiled FlatBuffers files and the simple fetch library are fairly basic, and a higher-level abstraction can provide a better developer experience.

The FlatBuffers integration is barely 3 months old. I first wanted to observe the adoption and potential issues. Arguably FlatBuffers is overkill for a simple weather API, but with up to 80 years of historical data, it makes a difference in parsing time. One issue, for example, is that the FlatBuffers structure is hard to understand.

I like the conversion step to Optional<T, TimeSeries<T>>. This makes it feel more like JSON.

One issue is extensibility with more variables. Because the API supports a never-ending number of weather variables, it is hard to hardcode a list of all acceptable variables for each endpoint. With pressure level weather variables like temperature_500hPa the number is already around 1000. Technically, each hourly variable could then also be requested as a daily variable with min, max, or mean aggregation. Right now I am working on a new feature to get the forecast that is 2 days old with variable names like temperature_2m_previous_day2. Listing "all" weather variables might not be feasible.

The FlatBuffers schema encodes this information as attributes. It might be a good idea to have a VariableWithValues.toString(). This should simplify the code and get rid of the index-based variable association, e.g. temperature2mMin: daily.variables(1)!.valuesArray()!.
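
A rough sketch of what such a `toString()` could do, written as a standalone function. The `VariableAttributes` interface and the name-building order are assumptions, loosely modeled on the attribute fields of the `VariableWithValues` table quoted later in this thread; they are not the SDK's actual types. In particular, real FlatBuffers scalars default to 0 rather than `undefined`, so an actual implementation would need its own "unset" check.

```typescript
// Hypothetical plain-object view of a variable's attributes.
interface VariableAttributes {
  variable: string // e.g. 'temperature'
  altitude?: number // metres above ground
  pressureLevel?: number // hPa
  previousDay?: number
  aggregation?: 'min' | 'max' | 'mean'
}

// Builds a JSON-style variable name such as 'temperature_2m_max'.
function variableName(v: VariableAttributes): string {
  const parts = [v.variable]
  if (v.altitude !== undefined) parts.push(`${v.altitude}m`)
  if (v.pressureLevel !== undefined) parts.push(`${v.pressureLevel}hPa`)
  if (v.previousDay !== undefined) parts.push(`previous_day${v.previousDay}`)
  if (v.aggregation !== undefined) parts.push(v.aggregation)
  return parts.join('_')
}

// Replaces index-based access with lookup by name.
function keyByName<T extends VariableAttributes>(vars: readonly T[]): Map<string, T> {
  return new Map(vars.map((v) => [variableName(v), v] as const))
}

const byName = keyByName([
  { variable: 'temperature', altitude: 2, aggregation: 'max', values: [4.1, 5.3] },
  { variable: 'temperature', altitude: 2, aggregation: 'min', values: [-1.2, 0.4] },
])
console.log(byName.get('temperature_2m_min')?.values)
```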

@atombrenner
Author

@patrick-zippenfenig thanks for the kind response :-)
I found your package while trying to use the "so-called API" from DWD ICON, where you have to parse files on your own with very inaccessible documentation. So the open-meteo API is already incredibly useful for me.

I have a question about the ever-growing number of parameters. To be useful, each parameter must be documented.
I added all parameters from the open-meteo website. The enums in my package should ideally be created from the docs. Unfortunately the structure of the website did not allow easy generation, e.g. multiple parameters are documented together. If new parameters are added, someone needs to implement and document them, so I would expect the parameter growth rate to slow down eventually.

If new parameters are added we would just need a new version of the packages.
You could even use new parameters with old packages if you knew the name of the new parameter.

As you mentioned, the schema of the parameters seems to be optimizable, e.g. all those temperature variants. Maybe parameters for the requested aggregation or height could be added. If there is a system behind the construction of the parameter names, maybe we could use string template literals to construct them automatically.
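
To illustrate the template literal idea, here is a small sketch using TypeScript template literal types. The variable and level lists are deliberately tiny, made-up subsets, and `pressureLevelName` is a hypothetical helper; the point is only that the combinations do not have to be listed by hand.

```typescript
// Illustrative subsets only; the real API has far more variables and levels.
type PressureVariable = 'temperature' | 'relative_humidity' | 'wind_speed'
type PressureLevel = 1000 | 850 | 500

// Expands to all 9 combinations, e.g. 'temperature_500hPa'.
type PressureLevelName = `${PressureVariable}_${PressureLevel}hPa`

// Builds a name from its parts. The cast tells the compiler that the
// concatenation always yields one of the generated names.
function pressureLevelName(variable: PressureVariable, level: PressureLevel): PressureLevelName {
  return `${variable}_${level}hPa` as PressureLevelName
}

const requested: PressureLevelName[] = [
  pressureLevelName('temperature', 500),
  'wind_speed_850hPa',
  // 'temperature_501hPa', // compile error: 501 is not a known level
]
console.log(requested)
```

With many variables and levels the generated union grows multiplicatively, so a real package would probably have to limit how many of these types it combines.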

I am no expert in meteorology, as a hobbyist I am probably fine with a subset of those metrics. We could have standard parameters and advanced parameters, to give beginners an easy start.

@patrick-zippenfenig
Member

> To be useful, each parameter must be documented.
> I added all parameters from the open-meteo website. The enums in my package should ideally be created from the docs.

Sure, adding more variables in future package versions is an option. The issue is that many variables like temperature_500hPa are available at ~60 different pressure levels. Multiplied by roughly 10 different pressure variables you already get 600 variables. Right now I am working on weather forecasts from previous days with variable names like temperature_2m_previous_day1, temperature_2m_previous_day2, ..._day3, ..._day4 and so on.

This is also the reason why the FlatBuffers response encodes attributes for pressure level or previous day separately:

table VariableWithValues {
  variable: Variable;
  unit: Unit;

  value: float; // Only used for current conditions
  values: [float]; // Contains a time series of data
  values_int64: [int64];  // Only for sunrise/set as a unix timestamp

  altitude: int16;
  aggregation: Aggregation;
  pressure_level: int16;
  depth: int16;
  depth_to: int16;
  ensemble_member: int16;
  previous_day: int16;
}

A solution could be to change how weather variables are requested. Dummy code:

let temperature_2m = OmVariable(variable: .temperature, altitude: 2)
let wind_speed_10m = OmVariable(variable: .wind_speed, altitude: 10)

let temperature_2m_max = OmVariable(variable: .temperature, altitude: 2, aggregation: .max)
let temperature_2m_min = OmVariable(variable: .temperature, altitude: 2, aggregation: .min)

// Translates `OmVariable` to a string `temperature_2m` (inefficient!)
let params = Params(
  latitude: 12,
  longitude: 34,
  hourly: [temperature_2m, wind_speed_10m],
  daily: [temperature_2m_max, temperature_2m_min]
)
let results = api.fetch(url, params)

// `.get()` matches all attributes like variable, altitude, aggregation, pressure_levels and returns a potential match
let temperature_2m_data = results.hourly.get(temperature_2m)
let wind_speed_10m_data = results.hourly.get(wind_speed_10m)
let temperature_2m_max_data = results.daily.get(temperature_2m_max)
let temperature_2m_min_data = results.daily.get(temperature_2m_min)

// Note: the Ensemble API returns multiple temperature_2m results with different `ensemble_member` attributes from 0...51.
let temperature_2m_ensembles = results.hourly.getMultiple(temperature_2m)

// Note: Technically different weather domains could be encoded as well, but this requires larger changes to the API backend
let temperature_2m_gfs013 = OmVariable(variable: .temperature, altitude: 2, domain: .gfs013)
let temperature_2m_hrrr = OmVariable(variable: .temperature, altitude: 2, domain: .hrrr)

This could be more scalable. For this implementation, a different API endpoint would be useful, one that accepts the attribute-based variable definition directly instead of parsing strings, or that takes the request encoded as FlatBuffers. The benefit is that all enumerations from the FlatBuffers definitions can be used directly. The drawback is that it requires more logic for each supported programming language.
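
For the TypeScript client, the attribute-based lookup from the dummy code could look roughly like the sketch below. Everything here is hypothetical: `OmVariable`, `OmSeries`, `findSeries` and `findAllSeries` are illustrative names (the latter two mirror `.get()` and `.getMultiple()` above), and the series are plain objects rather than the FlatBuffers types from @openmeteo/sdk.

```typescript
// Attribute-based variable definition, mirroring the dummy code above.
interface OmVariable {
  variable: string
  altitude?: number
  aggregation?: 'min' | 'max' | 'mean'
  pressureLevel?: number
}

// A returned time series carries the same attributes plus its values.
interface OmSeries extends OmVariable {
  ensembleMember?: number
  values: number[]
}

// A series matches when all queried attributes are equal; the ensemble
// member is deliberately ignored so one query can match several members.
function matches(series: OmSeries, query: OmVariable): boolean {
  return (
    series.variable === query.variable &&
    series.altitude === query.altitude &&
    series.aggregation === query.aggregation &&
    series.pressureLevel === query.pressureLevel
  )
}

// Returns the first matching series, if any.
function findSeries(all: readonly OmSeries[], query: OmVariable): OmSeries | undefined {
  return all.find((s) => matches(s, query))
}

// Returns every matching series, e.g. one per ensemble member.
function findAllSeries(all: readonly OmSeries[], query: OmVariable): OmSeries[] {
  return all.filter((s) => matches(s, query))
}

const hourly: OmSeries[] = [
  { variable: 'temperature', altitude: 2, ensembleMember: 0, values: [1.5, 2.0] },
  { variable: 'temperature', altitude: 2, ensembleMember: 1, values: [1.7, 2.2] },
  { variable: 'wind_speed', altitude: 10, values: [3.5, 4.0] },
]

const temperature2m: OmVariable = { variable: 'temperature', altitude: 2 }
console.log(findSeries(hourly, temperature2m)?.values)
console.log(findAllSeries(hourly, temperature2m).length)
```

Matching on attributes avoids both string parsing and index bookkeeping on the client side, at the cost of each language binding needing its own lookup logic.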

If I only consider the Swift programming language, I would use some kind of annotations:

struct MyWeatherData {
  @Variable(variable: .temperature, altitude: 2)
  let temperature2m: [Float]

  @Variable(variable: .precipitation)
  let precipitation: [Float]
}

let params = Params(
  latitude: 12, 
  longitude: 34, 
  hourlyClass: MyWeatherData.self
)

let results = api.fetch(url, params)

print(results.hourly.temperature2m)

I am not sure which is the right path to take forward. The functionality and variety of data of the API changed quite a lot over the last years. Keeping everything consistent is quite a hard job.

@atombrenner
Author

I understand what you mean by the variable count explosion. Just a few thoughts:

  • the underlying parameters in your example are temperature, pressure and height. Only those need to be documented. It is the combination that causes the number explosion.
  • the length of a URL is limited to a few KB, so having thousands of long variable names in the same query will technically not be feasible
  • if you store such a huge number of data points (variables), maybe a time-series database (e.g. InfluxDB or Prometheus) can give you some inspiration on how to query them
  • for complex use cases it's probably better to use a time-series query language instead of URL query parameters
  • it's OK to have different endpoints for different use cases, e.g. separating historical data retrieval from forecast data, or maybe even having separate hourly and daily endpoints.
  • a one-size-fits-all approach rarely makes everyone happy. A hobbyist user like myself is overloaded with complexity (e.g. I don't need pressure levels at all). An expert user doing research with historical data would be happier with a powerful time-series query language which also allows various statistical aggregations.

I am quite happy with my approach of offering only a solution for a simple forecast use case. Thanks to the naming conventions I parsed the variable names from the website and excluded variables that I don't understand (collapsed parameters, especially temperature per pressure level ;-)
