Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementing continuous integration #14

Open
vincentvanhees opened this issue Nov 10, 2021 · 4 comments
Open

Implementing continuous integration #14

vincentvanhees opened this issue Nov 10, 2021 · 4 comments

Comments

@vincentvanhees
Copy link
Collaborator

vincentvanhees commented Nov 10, 2021

Hi @TheTS, as discussed I would like to help implement continuous integration for palmsplusr. This will, once implemented, automatically check whether changes to the code break existing functionality. In other words it will make contributing to palmsplusr more attractive and less stressful.

I have just installed palmsplusr and tried to run the package tests and R package checks, and noticed some issues. To get started with the continuous integration it would be good to first have a version of the code where all tests and checks pass. Here is a list of the issues I ran into:

  1. In DESCRIPTION file: License: LGPL-3 needs to be License: LGPL-3 | file LICENSE
  2. Many of the functions in functions.R have undeclared variables, which R check doesn't like.
For example:
palms_add_domain: no visible binding for '<<-' assignment to
     ‘palmsplus_domains’
   palms_add_domain: no visible binding for global variable
     ‘palmsplus_domains’
   palms_add_field: no visible binding for '<<-' assignment to
     ‘palmsplus_fields’
   palms_add_field: no visible binding for global variable
     ‘palmsplus_fields’
   palms_add_multimodal_field: no visible binding for '<<-' assignment to
     ‘multimodal_fields’
   palms_add_multimodal_field: no visible binding for global variable
     ‘multimodal_fields’
   palms_add_trajectory_field: no visible binding for '<<-' assignment to
     ‘trajectory_fields’
   palms_add_trajectory_field: no visible binding for global variable
     ‘trajectory_fields’
   palms_add_trajectory_location: no visible binding for '<<-' assignment
     to ‘trajectory_locations’
   palms_add_trajectory_location: no visible binding for global variable
     ‘trajectory_locations’
   palms_build_days: no visible binding for global variable
     ‘palmsplus_domains’
   palms_build_days: no visible binding for global variable
     ‘palmsplus_fields’
   palms_build_days: no visible binding for global variable ‘domain_field’
   palms_build_days: no visible binding for global variable ‘name’
   palms_build_days: no visible binding for global variable ‘identifier’
   palms_build_days: no visible binding for global variable ‘datetime’
   palms_build_days: no visible binding for global variable ‘.’
   palms_build_days: no visible global function definition for ‘UQ’
   palms_build_days: no visible binding for global variable ‘duration’
   palms_build_multimodal: no visible binding for global variable
     ‘identifier’
   palms_build_multimodal: no visible binding for global variable
     ‘tripnumber’
   palms_build_multimodal: no visible binding for global variable
     ‘geometry’
   palms_build_multimodal: no visible binding for global variable
     ‘end_point’
   palms_build_multimodal: no visible binding for global variable
     ‘start_point’
   palms_build_multimodal: no visible binding for global variable
     ‘end_prev’
   palms_build_multimodal: no visible binding for global variable
     ‘distance_diff’
   palms_build_multimodal: no visible binding for global variable
     ‘time_diff’
   palms_build_multimodal: no visible binding for global variable
     ‘mmt_number’
   palms_build_multimodal: no visible binding for global variable
     ‘mmt_criteria’
   palms_build_multimodal: no visible binding for global variable
     ‘multimodal_fields’
   palms_build_multimodal: no visible binding for global variable ‘mot’
   palms_build_multimodal: no visible binding for global variable
     ‘variable’
   palms_build_multimodal: no visible binding for global variable ‘value’
   palms_build_multimodal: no visible binding for global variable
     ‘trajectory_locations’
   palms_build_multimodal: no visible binding for global variable
     ‘palmsplus’
   palms_build_multimodal: no visible binding for global variable
     ‘triptype’
   palms_build_multimodal: no visible binding for global variable
     ‘start_trip’
   palms_build_multimodal: no visible binding for global variable
     ‘end_trip’
   palms_build_palmsplus: no visible binding for global variable
     ‘palmsplus_fields’
   palms_build_palmsplus: no visible binding for global variable
     ‘identifier’
   palms_build_trajectories: no visible binding for global variable
     ‘trajectory_fields’
   palms_build_trajectories: no visible binding for global variable
     ‘after_conversion’
   palms_build_trajectories: no visible binding for global variable
     ‘trajectory_locations’
   palms_build_trajectories: no visible binding for global variable
     ‘tripnumber’
   palms_build_trajectories: no visible binding for global variable
     ‘identifier’
   palms_epoch: no visible binding for global variable ‘datetime’
   palms_in_polygon: no visible binding for global variable ‘.’
   palms_in_time: no visible binding for global variable ‘school_id’
   palms_in_time: no visible binding for global variable ‘class_id’
   palms_remove_tables: no visible binding for global variable
     ‘palmsplus_domains’
   palms_remove_tables: no visible binding for global variable
     ‘palmsplus_fields’
   palms_remove_tables: no visible binding for global variable
     ‘trajectory_fields’
   palms_remove_tables: no visible binding for global variable
     ‘trajectory_locations’
   palms_remove_tables: no visible binding for global variable
     ‘multimodal_fields’
   Undefined global functions or variables:
     . UQ after_conversion class_id datetime distance_diff domain_field
     duration end_point end_prev end_trip geometry identifier mmt_criteria
     mmt_number mot multimodal_fields name palmsplus palmsplus_domains
     palmsplus_fields school_id start_point start_trip time_diff
     trajectory_fields trajectory_locations tripnumber triptype value
     variable
  1. The use of parse_expr is not flagged by R checks but can become a security vulnerability when using PALMSplus in the cloud, see also this and this post. Ideally users should not be allowed to insert code snippets. If this is unavoidable then my proposal would be that we add a function that checks the code string for potential malicious content before it is evaluated with parse_expr.
  2. Vignettes do not build, probably because they depend on local files. To get around this I have set all R blocks to eval=FALSE. Long term solution would be to let vignettes run based on data that is included in the repository.
  3. Function files do not reflect name of function inside them, and functions.R seems a container for various smaller functions. The combination of both complicates finding functions by name, e.g. when errors refer to functions names. For now this is not super urgent, but would be good to revisit at a later stage.
  4. funs in dplyr has been depricated:
Warning message:
Warning (test_all.R:73:3): Testing normal workflow
`funs()` was deprecated in dplyr 0.8.0.
Please use a list of either functions or lambdas: 

  # Simple named list: 
  list(mean = mean, median = median)

  # Auto named with `tibble::lst()`: 
  tibble::lst(mean, median)

  # Using lambdas
  list(~ mean(., trim = .2), ~ median(., na.rm = TRUE))
This warning is displayed once every 8 hours.
Call `lifecycle::last_lifecycle_warnings()` to see where this warning was generated.
Backtrace:
 1. palmsplusr::palms_build_days(palmsplus) test_all.R:73:2
 9. dplyr::funs(. * palms_epoch(data)/60)
  1. Test1.R and test2.R rely on your local files. This is not ideal as tests should be written such that they can run anywhere to enable contributors to check that their proposed changes do not break functionality. Also things like setwd() should be avoided. As a temporary fix I have added both tests to the .RBuildignore. However, it would be good if tests could be rewritten such that functionality is tested without this dependency. For example with a tiny dummy dataset or real anonymized dataset.

Next steps:

I suppose 2, 3, and 7 are most critical for Habitus. So, I can start with 2 and 3.

@TheTS
Copy link
Owner

TheTS commented Nov 12, 2021

Nice one, thanks Vincent.

  1. Should be an easy fix
  2. I think this is mostly related to creating the ‘config tables’ in the global environment (to replicate the original SQL design). I suspect most of these can be fixed by moving to the external config file setup, as discussed.
  3. At the time, I couldn’t figure out a way to enable custom formulas to be specified in the config tables, without parse_expr. There may be a better way to do this, but it might require substantial changes to the code.
  4. Yep! The second vignette shouldn’t be built as it demonstrates a large dataset (which cannot be included in the package). It has been my intention to create a better example dataset (I just haven’t got around to it). I will ensure the vignettes use the example data in inst/extdata and data/
  5. Should be an easy fix. As a first step, I can rename the r files for the main functions (days, trajectories, etc) and update the documentation.
  6. I suspect there are a few things like this that could be updated (even if they still pass the check), as several things in the tidyverse and other packages have been updated in the last 4 years. For example, I know I can remove the geodist package, as the st_distance function is now fixed in the sf package.
  7. These two files should have probably never made it to Github. They are just my local test files. All of tests should be housed in tests/testthat

I’ll have a look into the config file this weekend and see if I can get something to work (will keep you posted)

Cheers

Tom

@TheTS
Copy link
Owner

TheTS commented Nov 15, 2021

I have worked on fixing points 1, 4, 5, 6, and 7, so it should pass the checks without any errors. I'll look into the config tables next.

@vincentvanhees
Copy link
Collaborator Author

Thanks Tom. Sorry for not replying any sooner. I have been busy with getting the rest of the Habitus Shiny app to work.

In the upcoming two weeks I can free up some days to help you with preparing Palmsplusr.

For the Habitus app I would like to arrive at a palmsplusr wrapper function, either inside palmsplusr OR as a function that lives inside the app, which applies Palmsplusr to some user-selected datafolder(s)/file(s), aided by a configuration file.

Would it help if I start drafting this wrapper function? Maybe that could help me to familiarise myself with Palmsplusr and from that discover what else I may be able to contribute.

@TheTS
Copy link
Owner

TheTS commented Dec 7, 2021

Hi @vincentvanhees

I have been working on the config_file branch. I've added a wrapper function called palmsplus_shell, which in it's simplest form is:

palmsplus_shell(path_to_habitus_output, path_to_config_file, ...)

I've got it functional at the basic level (without any user GIS files). I'll work on it more over the next few days and come up with a list of questions (e.g. format of HABITUS output, where and how the output is saved).

Cheers

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants