-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Group boxplot subarray #26
Conversation
- all_of() should be used to select multiple columns in dplyr::select() - previously, a warning was thrown in boxplotSubarray() when features (SeqIds) were selected to generate the plotting dataset - all_of() was added to the dplyr::select() call to resolve this warning
Codecov ReportAll modified and coverable lines are covered by tests ✅
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## main #26 +/- ##
=======================================
Coverage ? 42.89%
=======================================
Files ? 29
Lines ? 1119
Branches ? 0
=======================================
Hits ? 480
Misses ? 639
Partials ? 0 ☔ View full report in Codecov by Sentry. |
d27b9cb
to
66b980d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Just a few minor comments.
R/boxplotSubarray.R
Outdated
feats <- getAnalytes(.data) | ||
|
||
if ( !all(reqd_cols %in% names(.data)) ) { | ||
msng_idx <- which(!reqd_cols %in% names(.data)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here I would use msng <- setdiff(reqd_cols, names(.data))
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's definitely a better option! I was overthinking this, setdiff()
is much simpler.
#' | ||
#' @family boxplots | ||
#' @param .data A `soma_data` or data frame object created via a call to | ||
#' [read_adat()]. | ||
#' @param .data A `soma_data` or data frame object, created from a SomaScan |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another unification of args issue ... what do we want to call .data
? I can't remember why I called it this ... but we might want to think about converging on an argname for when it's specifically a soma_adat
object that's expected, even if a data.frame
would ultimately suffice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Throughout SomaPlotr, the plotting functions that take a soma_adat
or data.frame
as input typically use either .data
or data
as the argument. I was thinking of having them all converge on data
, but that is pretty general and doesn't suggest a soma_adat
specifically. Would adat
be more appropriate? I think this change (unifying this argument name across the entire package) will be a separate commit, but I can add it to this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, it's beyond scope for this PR, but just something to think about.
I would probably steer you towards data=
, simply because 1) it's more common in the R universe, 2) it's used in ggplot2
, 3) might want to avoid confusion with rlang::.data
pronoun that's pretty common. Just some things to think about, though likely down the road when thinking about converging/aligning arg-names.
R/boxplotSubarray.R
Outdated
#' the term "subarray" is analogous to sample, and typically indicates a row | ||
#' or sample in the data. | ||
#' Plots the distribution of all analytes, stratified by subarray, as a | ||
#' boxplot. These plots are intended to be used as a quality control (QC) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might want to consider rewording so as not to confuse with QC
samples which we commonly label QC
in ADATs. I would think there's a way to word-smith to avoid the potential double-meaning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point, I'll try to make the statement about QC more specific so it doesn't cause confusion.
R/boxplotSubarray.R
Outdated
#' boxplot. These plots are intended to be used as a quality control (QC) | ||
#' visualization tool for SomaScan data. In SomaScan (`soma_adat`) data | ||
#' format, the term "subarray" is analogous to sample, and typically indicates | ||
#' a row or sample in the data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove "or sample" since we already say that prior to the comma.
- updated documentation to describe required feature columns - added vectors of required feature columns to utils.R - incorporated feature columns into internal data object (plot_data), enabling the use of facet_wrap() on plots created with boxplotSubarray() - added example of grouping boxplotSubarray() plots by a feature column to the documentation - fixes SomaLogic#24
66b980d
to
b09a292
Compare
Overview of Pull Request
Fixes #24
Main changes
facet_wrap()
on a figure created withboxplotSubarray()
. No arguments/parameters were added to the function.Change type
Please check the relevant box(es):
Choose reviewer(s)
Reviewer by Department