Merge pull request #224 from SebKrantz/development
Update NEWS regarding #221.
SebKrantz authored Feb 5, 2022
2 parents d6b092e + 76501f2 commit ea4fb4e
Showing 1 changed file with 2 additions and 0 deletions.
2 changes: 2 additions & 0 deletions NEWS.md
@@ -1,5 +1,7 @@
# collapse 1.7.5

* In the development version on GitHub, a `.` was added to the first argument of functions `fselect`, `fsubset`, `colorder` and `fgroup_by`, i.e. `fselect(x, ...) -> fselect(.x, ...)`. The reason is that over time I added the option to select and rename columns in one step, e.g. `fselect(mtcars, cylinders = cyl)`, which was not offered when these functions were created. This causes problems if a column should be renamed to `x`: `fselect(mtcars, x = cyl)` fails, see e.g. #221. Renaming the argument to `.x` somewhat guards against such situations. I think this API change is worthwhile, because it makes the package more robust going forward, and the first argument of these functions is rarely set explicitly by name. For now it remains in the development version, which you can install using `remotes::install_github("SebKrantz/collapse")`. If you have strong objections to this change (because it will break your code, or you know of people whose programming style is to explicitly name the first argument of data manipulation functions), please let me know!

* Also ensuring tidyverse examples are in `\donttest{}` and building without the *dplyr* testing file to avoid issues with static code analysis on CRAN.

* 20-50% speed improvement in `gsplit` (and therefore in `fsummarise`, `fmutate`, `collap` and `BY` *when invoked with base R functions*) when grouping with `GRP(..., sort = TRUE, return.order = TRUE)`. To enable this by default, the default for argument `return.order` in `GRP` was set to `sort`, which retains the ordering vector (needed for the optimization). Retaining the ordering vector uses some memory, which can adversely affect computations with big data, but with big data `sort = FALSE` usually gives faster results anyway, and you can always set `return.order = FALSE` (also in `fgroup_by`, `collap`), so this default gives the best of both worlds.
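
To illustrate why renaming the first argument from `x` to `.x` helps with `fselect(mtcars, x = cyl)`, here is a minimal base R sketch of the underlying argument-matching behaviour. The functions `select_old` and `select_new` are hypothetical stand-ins written for this example; they are not the *collapse* implementations and only mimic the signatures `(x, ...)` and `(.x, ...)`.

```r
# Hypothetical toy functions, NOT the collapse implementations.
# They only return the names supplied through `...`.

select_old <- function(x, ...) {
  force(x)  # evaluate the data argument
  if (!is.data.frame(x)) stop("x must be a data frame")
  names(as.list(substitute(list(...)))[-1L])
}

select_new <- function(.x, ...) {
  force(.x)
  if (!is.data.frame(.x)) stop(".x must be a data frame")
  names(as.list(substitute(list(...)))[-1L])
}

# With a formal named `x`, R matches `x = cyl` to the data argument by name,
# pushes `mtcars` into `...`, and then fails when it tries to evaluate `cyl`.
old_result <- tryCatch(
  select_old(mtcars, x = cyl),
  error = function(e) conditionMessage(e)
)

# With a formal named `.x`, the name `x` no longer matches it, so `mtcars`
# is matched by position and `x = cyl` stays in `...` as a rename request.
new_names <- select_new(mtcars, x = cyl)

print(old_result)
print(new_names)
```

The same reasoning explains why the change only "somewhat" guards against clashes: a column renamed to `.x` would presumably run into the same matching problem, it is just a much less likely name.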