You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
the functions well organized into individual files
pkgdown site very useful
all those wrappers for backwards compatibility of the functions will be useful
whole package is of great use to the program and worth the time invested to build it
code itself is largely clean, simple, easy to read and interpret. No massive cascades of nested functions, phew!
great package name - short, descriptive, funny
Things to change
For the function get_variety_db() in access_controlled_vocab.R, the default argument select_before = "2021-01-01" should probably changed to the date the user is accessing the file: select_before = Sys.Date() since they should grab all the data.
Specify in the appropriate place that %<% pipe should be imported from magrittr.
Areas for improvement:
Use fewer dependencies when possible. This reduces overall maintenance required to keep the package working. factcuratoR clearly requires readxl, fuzzyjoin, validate, and dm as part of its core functionality, so no need to consider removing those. However, some packages have slim usages and we may be able to consider refactoring the code so they can be removed: tibble, lubridate, readr, and stringr. We could also consider refactoring the code to remove dplyr, but that package is used extensively and perhaps not worth the time to refactor since this would be a large undertaking. I am unsure about tidyr and purrr. Overall though, this is only my best guess and I welcome input on this topic.
Moving forward, here some common replacements:
tidyverse function
base R equivalent
._join()
merge()
pivot_.()
reshape()
case_when()
ifelse()
group_by()
aggregate()
map()
lapply()
distinct()
unique()
count()
table()
These functions take more effort to use than tidyverse functions, but they are less prone to break the package since at this point they are rarely if every changed.
*Note: for lubridate::parse_date_time(), you can probably run As.Date() and try several formats you think are likely to work. In some cases (e.g., access_controlled_vocab.R), it might simplify the code.
An FYI on the term 'recursive': this is more than a function that loops or repeats itself. It is a function that calls itself -- a crazy concept to wrap your mind around. Recursive functions behave very oddly and can easily get caught in an infinite loop, which is why their usage is strongly discouraged across all programming languages.
In the file create_dm.R, there are a few for loops. It is absolutely not worth our time to fix this, but in the future, consider running apply() commands instead of for loops. These are much faster overall (but in our case, the time spent refactoring the code will vastly outstrip time saved in code execution). for loops are valuable when you really need the output from the previous loop to move forward (e.g. in MCMC), but if you don't, apply() will run the command in parallel.
Functions can use more error checking (that the user has supplied expected object types).
The text was updated successfully, but these errors were encountered:
Other replies:
for create_dm.R, I am pretty sure I had to use a for loop because I was "modifying in place." I was having trouble with lapply() so I used a for loop based on what was in the old version Advanced R by Wickham: http://adv-r.had.co.nz/Functionals.html#functionals-not
Strengths:
Things to change
get_variety_db()
in access_controlled_vocab.R, the default argumentselect_before = "2021-01-01"
should probably changed to the date the user is accessing the file:select_before = Sys.Date()
since they should grab all the data.%<%
pipe should be imported from magrittr.Areas for improvement:
Moving forward, here some common replacements:
._join()
merge()
pivot_.()
reshape()
case_when()
ifelse()
group_by()
aggregate()
map()
lapply()
distinct()
unique()
count()
table()
These functions take more effort to use than tidyverse functions, but they are less prone to break the package since at this point they are rarely if every changed.
*Note: for
lubridate::parse_date_time()
, you can probably runAs.Date()
and try several formats you think are likely to work. In some cases (e.g., access_controlled_vocab.R), it might simplify the code.for
loops. It is absolutely not worth our time to fix this, but in the future, consider runningapply()
commands instead offor
loops. These are much faster overall (but in our case, the time spent refactoring the code will vastly outstrip time saved in code execution).for
loops are valuable when you really need the output from the previous loop to move forward (e.g. in MCMC), but if you don't,apply()
will run the command in parallel.The text was updated successfully, but these errors were encountered: