Commit

Allow : in select strings (#551)
* Allow : in select strings

* lintr

* fix test

* test. comment

* news item
strengejacke authored Oct 10, 2024
1 parent db85541 commit 9dff2ae
Showing 38 changed files with 217 additions and 83 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -1,7 +1,7 @@
Type: Package
Package: datawizard
Title: Easy Data Wrangling and Statistical Transformations
-Version: 0.13.0.1
+Version: 0.13.0.2
Authors@R: c(
person("Indrajeet", "Patil", , "[email protected]", role = "aut",
comment = c(ORCID = "0000-0003-1995-6531")),
6 changes: 6 additions & 0 deletions NEWS.md
@@ -1,5 +1,11 @@
# datawizard (development)

+CHANGES
+
+* The `select` argument, which is available in different functions to select
+  variables, can now also be a character vector with quoted variable names,
+  including a colon to indicate a range of several variables (e.g. `"cyl:gear"`).
+
BUG FIXES

* `describe_distribution()` no longer errors if the sample was too sparse to compute
11 changes: 8 additions & 3 deletions R/extract_column_names.R
@@ -9,8 +9,10 @@
#' tasks. Can be either
#'
#' - a variable specified as a literal variable name (e.g., `column_name`),
-#' - a string with the variable name (e.g., `"column_name"`), or a character
-#'   vector of variable names (e.g., `c("col1", "col2", "col3")`),
+#' - a string with the variable name (e.g., `"column_name"`), a character
+#'   vector of variable names (e.g., `c("col1", "col2", "col3")`), or a
+#'   character vector of variable names including ranges specified via `:`
+#'   (e.g., `c("col1:col3", "col5")`),
#' - a formula with variable names (e.g., `~column_1 + column_2`),
#' - a vector of positive integers, giving the positions counting from the left
#' (e.g. `1` or `c(1, 3, 5)`),
@@ -116,7 +118,7 @@
#' ```
#'
#' @examples
-#' # Find columns names by pattern
+#' # Find column names by pattern
#' extract_column_names(iris, starts_with("Sepal"))
#' extract_column_names(iris, ends_with("Width"))
#' extract_column_names(iris, regex("\\."))
@@ -129,6 +131,9 @@
#' numeric_mean_35 <- function(x) is.numeric(x) && mean(x, na.rm = TRUE) > 3.5
#' extract_column_names(iris, numeric_mean_35)
#'
+#' # find a range of column names, using a character vector
+#' extract_column_names(mtcars, c("cyl:hp", "wt"))
+#'
#' # rename returned columns for "data_select()"
#' head(data_select(mtcars, c(`Miles per Gallon` = "mpg", Cylinders = "cyl")))
#' @export
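
Reviewer note: the new documentation example `c("cyl:hp", "wt")` mixes a range string with a plain name. The base R sketch below illustrates the selection this is expected to describe on `mtcars`. `expand_select()` is a hypothetical helper written for this note, not a datawizard function, and it assumes both endpoints of every range exist in the data.

```r
# Hypothetical helper, not part of datawizard: expand "from:to" strings
# into the column names they span and leave plain names untouched.
# Assumes both endpoints of each range are valid column names.
expand_select <- function(data, select) {
  columns <- colnames(data)
  expanded <- lapply(select, function(s) {
    if (grepl(":", s, fixed = TRUE)) {
      # positions of the two endpoints within the column names
      ends <- match(strsplit(s, ":", fixed = TRUE)[[1]], columns)
      columns[ends[1]:ends[2]]
    } else {
      s
    }
  })
  unlist(expanded)
}

expand_select(mtcars, c("cyl:hp", "wt"))
# "cyl" "disp" "hp" "wt"
```

The range is resolved by column position, so `"cyl:hp"` covers every column located between `cyl` and `hp` in the data frame, whatever those columns are called.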
57 changes: 45 additions & 12 deletions R/select_nse.R
@@ -139,38 +139,71 @@
# Possibilities:
# - quoted variable name
# - quoted variable name with ignore case
+# - quoted variable name with colon, to indicate range
# - character that should be regex-ed on variable names
# - special word "all" to return all vars

 .select_char <- function(data, x, ignore_case, regex, verbose) {
   # use colnames because names() doesn't work for matrices
   columns <- colnames(data)
   if (isTRUE(regex)) {
     # string is a regular expression
     grep(x, columns)
   } else if (length(x) == 1L && x == "all") {
     # string is "all" - select all columns
     seq_along(data)
+  } else if (any(grepl(":", x, fixed = TRUE))) {
+    # special pattern, as string (e.g. select = c("cyl:hp", "am")). However,
+    # this will first go into `.eval_call()` and thus only single elements
+    # are passed in `x` - we never have a character *vector* here
+    # check for valid names
+    colon_vars <- unlist(strsplit(x, ":", fixed = TRUE))
+    colon_match <- match(colon_vars, columns)
+    if (anyNA(colon_match)) {
+      .warn_not_found(colon_vars, columns, colon_match, verbose)
+      matches <- NA
+    } else {
+      start_pos <- match(colon_vars[1], columns)
+      end_pos <- match(colon_vars[2], columns)
+      if (!is.na(start_pos) && !is.na(end_pos)) {
+        matches <- start_pos:end_pos
+      } else {
+        matches <- NA
+      }
+    }
+    matches[!is.na(matches)]
   } else if (isTRUE(ignore_case)) {
     # find columns, case insensitive
     matches <- match(toupper(x), toupper(columns))
     matches[!is.na(matches)]
   } else {
     # find columns, case sensitive
     matches <- match(x, columns)
-    if (anyNA(matches) && verbose) {
-      insight::format_warning(
-        paste0(
-          "Following variable(s) were not found: ",
-          toString(x[is.na(matches)])
-        ),
-        .misspelled_string(
-          columns,
-          x[is.na(matches)],
-          default_message = "Possibly misspelled?"
-        )
-      )
+    if (anyNA(matches)) {
+      .warn_not_found(x, columns, matches, verbose)
     }
     matches[!is.na(matches)]
   }
 }

+# small helper, to avoid duplicated code
+.warn_not_found <- function(x, columns, matches, verbose = TRUE) {
+  if (verbose) {
+    insight::format_warning(
+      paste0(
+        "Following variable(s) were not found: ",
+        toString(x[is.na(matches)])
+      ),
+      .misspelled_string(
+        columns,
+        x[is.na(matches)],
+        default_message = "Possibly misspelled?"
+      )
+    )
+  }
+}


# 3 types of symbols:
# - unquoted variables
# - objects that need to be evaluated, e.g data_find(iris, i) where
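
Reviewer note: the colon branch added to `.select_char()` can be approximated in base R to see how it behaves at the edges. This is a sketch only. `select_range()` is a hypothetical stand-in that uses base `warning()` in place of `insight::format_warning()` and `.misspelled_string()`, and it returns `integer(0)` for unknown names, whereas the code in this commit returns an empty logical vector.

```r
# Hypothetical stand-in for the colon branch of .select_char().
# `x` is a single string such as "cyl:hp"; `columns` are the column names.
select_range <- function(columns, x, verbose = TRUE) {
  ends <- strsplit(x, ":", fixed = TRUE)[[1]]
  pos <- match(ends, columns)
  if (anyNA(pos)) {
    # at least one endpoint is not a column name: warn and select nothing
    if (verbose) {
      warning(
        "Following variable(s) were not found: ",
        toString(ends[is.na(pos)]),
        call. = FALSE
      )
    }
    return(integer(0))
  }
  # only the first two endpoints are used, as in the commit
  pos[1]:pos[2]
}

cols <- colnames(mtcars)
select_range(cols, "cyl:hp")                   # 2 3 4
select_range(cols, "hp:cyl")                   # 4 3 2
select_range(cols, "foo:hp", verbose = FALSE)  # integer(0)
```

Two things stand out in the sketch and appear to apply to the committed code as well: a reversed range such as `"hp:cyl"` yields positions in descending order, and a string with more than one colon silently ignores everything after the second name.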
Some generated files are not rendered by default:

6 changes: 4 additions & 2 deletions man/adjust.Rd
6 changes: 4 additions & 2 deletions man/assign_labels.Rd
6 changes: 4 additions & 2 deletions man/categorize.Rd
6 changes: 4 additions & 2 deletions man/center.Rd
6 changes: 4 additions & 2 deletions man/convert_na_to.Rd
6 changes: 4 additions & 2 deletions man/convert_to_na.Rd
6 changes: 4 additions & 2 deletions man/data_codebook.Rd
6 changes: 4 additions & 2 deletions man/data_duplicated.Rd
6 changes: 4 additions & 2 deletions man/data_extract.Rd
6 changes: 4 additions & 2 deletions man/data_group.Rd
6 changes: 4 additions & 2 deletions man/data_peek.Rd
6 changes: 4 additions & 2 deletions man/data_relocate.Rd
6 changes: 4 additions & 2 deletions man/data_rename.Rd
6 changes: 4 additions & 2 deletions man/data_replicate.Rd
6 changes: 4 additions & 2 deletions man/data_separate.Rd
6 changes: 4 additions & 2 deletions man/data_tabulate.Rd
6 changes: 4 additions & 2 deletions man/data_to_long.Rd
6 changes: 4 additions & 2 deletions man/data_unique.Rd
6 changes: 4 additions & 2 deletions man/data_unite.Rd
6 changes: 4 additions & 2 deletions man/describe_distribution.Rd
