Below is a description of the folders and the files we find in these
- data: Files containing dummy data to start setting up the scripts
- R: Contains the scripts to extract, and preprocess the patient and the product data.
- patient_tracker_extract_helper : Helper functions to extract the data for the patients (needed to run the patient_tracker_extract script)
- helper_product_data: Helper functions to extract the data for the products
- patient_tracker_extract: Script that reads in the different raw (Excel) trackers for each clinic and year and extracts the data into an machine readable table.
- patient_tracker_format: Script that reads in the output of patient_tracker_extract and reformats columns according to the codebook indications, performs checks and removes duplicates (i.e., patients whose information is copied across months but remains unchanged). Returns a dataframe that is ready to input in the database and another dataframe indicating the locations of errors or non-readable data.
- 4ADMonthlyTrackerCodebook: Codebook containing the information on the variables that we are extracting from the trackers (patient and product data). Also contains tabs listing the different labels that one variable may have together with its standardized formulation.
We use the renv package to manage dependencies.
This project was setup with renv
and uses a local .Rprofile
file that activates renv
. The first time you open this project in RStudio, this will check if renv
is already installed, and, if that is not the case, install it.
You are then informed about the difference between your local R environment and the packages used for this project. Check your console in RStudio after opening. You should see something like:
* One or more packages recorded in the lockfile are not installed.
* Use `renv::status()` for more details.
You can use renv::status()
to see which packages will be installed. Once you are ready, run
renv::restore()
and renv
will install all packages with the version as stated in the renv.lock
file.
See collaborating for the full details.
While working on a project, you or your collaborators may need to update or install new packages in your project. When this occurs, you’ll also want to ensure your collaborators are then using the same newly-installed packages. In general, the process looks like this:
- A user installs, or updates, one or more packages in their local project library;
- That user calls
renv::snapshot()
to update therenv.lock
lockfile; - That user then shares the updated version of
renv.lock
with their collaborators (meaning that this file is commited and pushed viagit
); - Other collaborators then call
renv::restore()
to install the packages specified in the newly-updated lockfile.
If you want to add another package to this project, install the package with renv::install()
instead of package.install()
.
Note: Not all packages are locked by renv
. For example, if you want to preview this Readme with RStudio, RStudio will likely ask to install or update additional packages like markdown
. That is ok and intended because these additional packages are not used by the project code so it is up to you to install them or not.
Once you have installed the dependencies with renv::restore()
you can go on and load our "package".
However, you need one more dependency, and that is devtools. This is because renv
does only lock required packages that are needed to run the code, not packages to develop the code (like devtools
).
So make sure to run
install.packages("devtools")
It is also possible that this package was already downloaded because we have this code in the .Rprofile
file that is executed automatically when RStudio is opened:
if (interactive()) {
require("devtools", quietly = TRUE)
# automatically attaches usethis
devtools::load_all()
}
Now you have access to devtools
, which is in fact a set of packages that are installed, like usethis
and roxygen2
.
To load all functions within the ./R
folder, run
devtools::load_all()
This will make available the a4d
package in the global environment, giving you access to all functions within it, as well as the core packages of tidyverse
because it is listed at "Depends" in the DESCRIPTION file.
You can now go ahead and run one of the two main scripts:
run_a4d_patient_data.R
run_a4d_product_data.R
We will all have the encrypted data stored on different folders within our computers. To account for the different file paths for every user and to speed the selection of the tracker files (where the data is stored), there is the following solution:
- Run
usethis::edit_r_environ()
- This should open the following file:
.Renviron
- This should open the following file:
- Add the following line:
A4D_DATA_ROOT = "your_path"
- Replace
"your_path"
with the path to your A4D tracker files- E.g.
A4D_DATA_ROOT = "D:/A4D"
- E.g.
- Save the
.Renviron
file - You are good to go and will not need to re-select the folder containing the tracker files when running
select_A4D_directory()
. This function will now get the correct path from the.Renviron
file.
For a short overview, see cheatsheets.
If you want to change any (code) file, add new files or delete existing files, please follow these steps:
- only once:
git clone
this repository - switch to the develop branch:
git checkout develop
orgit switch develop
- update develop:
git pull
- create a new branch:
git checkout -b <issue-no>-<tilte>
(or create the branch in GitHub and just switch to it, no-b
needed then) - do your code changes, create new R files with
usethis::use_r()
and new test files withusethis::use_test()
- load and test/execute your code changes:
devtools::load_all()
- run the tests:
devtools::test()
a. Fix any problems until all tests are green and make sure your changes do not break other code - check the package:
devtools::check()
a. Fix any problems until all checks are green - document your functions: use roxygen documentation by comments starting with
#'
- update documentation:
devtools:document()
- optional: add additional documentation to the README or create a RMarkdown file with examples
- get latest changes from develop:
git merge develop
a. if there are any conflicts, solve them - create a (final) commit with all your changes:
git add <files>
andgit commit -m"<message>"
- push your changes:
git push
(for a newly created local branch, you will need to set an up-stream first, just follow the instructions on the command line) - Check the GitHub workflows (cicd pipelines) for your branch and fix any problems
- Create a PR with develop as target
- Again, check the GitHub workflows for your PR and fix any problems
In addition to this general workflow, there are some additional steps required if you made use of external packages not yet stored in renv.lock
:
- install the package: Use
renv::install()
if you want to use this package only for development, orusethis::use_package()
to add the package to the DESCRIPTION file a. if you want to update a package, you can also userenv::install()
, without arguments it will update all listed packages in the lock file - use the package with the
package::fun()
syntax in your code - use
renv::snapshot()
to update therenv.lock
- make sure to add the
renv.lock
file with your PR