Releases: ddsjoberg/gtsummary
gtsummary 2.0.3
New Features and Functions
- Added functions tbl_hierarchical(), tbl_hierarchical_count(), tbl_ard_hierarchical(), brdg_hierarchical(), and pier_summary_hierarchical(). Consider these functions a preview. We will be making changes without the full deprecation cycle in the coming releases. (#1872)
- Added the style_*(prefix, suffix) and label_style_*(prefix, suffix) arguments for adding a string before or after the formatted results. These arguments have not been added to the p-value formatting functions. (#1690)
- Added argument tbl_ard_summary(overall). When TRUE, the ARD is parsed into the primary ARD and the overall ARD, and we run tbl_ard_summary() |> add_overall(). (#1940)
- Added the add_stat_label.tbl_ard_summary() method. (#1969)
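As a minimal sketch of the new prefix and suffix arguments (assuming gtsummary 2.0.3 or later is installed; the input values are made up for illustration):

```r
library(gtsummary)

# style_*() functions return a formatted character value;
# the suffix string is appended after the formatted result.
pct <- style_percent(0.4512, suffix = "%")

# label_style_*() functions return a styling function that
# carries the same prefix/suffix settings.
fmt_dollars <- label_style_number(digits = 2, prefix = "$")
dollars <- fmt_dollars(1234.5)

pct
dollars
```

Per the note above, the p-value formatting functions do not accept these arguments.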
Other Updates
- Headers in {gt} tables exported to PDF do not support the \n line break. Previously, line breaks were stripped from the header in the print.gtsummary() S3 method, but this did not apply to users who call as_gt() to further customize their tables. As a result, the line-break stripping has been migrated to as_gt(). (#1960)
- Migrated tbl_survfit.list(conf.level) up to tbl_survfit.data.frame(conf.level), where the confidence level is passed to survival::survfit().
- Updated tbl_ard_summary() to better handle non-standard ARDs (i.e. not our typical continuous or categorical summaries) by assigning them a default summary type. (#1991)
- Made oneway.test() available in add_p.tbl_continuous(). (#1970)
- Removed the deprecated 'aov' test from the tests.R file listing available tests. (#1970)
- Removed documentation for the add_overall.tbl_ard_summary(digits) argument, which was never meant to be a part of this function. (#1975)
Bug Fixes
- Bug fix in add_overall.tbl_custom_summary() due to an extraneous argument being passed to tbl_custom_summary(). (#2027)
- Bug fix in add_p.tbl_survfit() when the original call included the tbl_survfit(type) specification. (#2002)
- Removed the "tbl_summary-arg:statistic" theme that was incorrectly added to tbl_continuous().
gtsummary 2.0.2
Updates to address regressions in the v2.0.0 release:
- The default add_glance_*(glance_fun) function fixed for mice models with class 'mira'. (#1912)
- We can again report unweighted statistics in the headers of tbl_svysummary() tables. (#1911)
- tbl_uvregression() properly handles variables specified in the include argument with non-syntactic names. (#1932)
- NA values can again be specified in add_stat_label(label) to suppress a statistic label from being placed. (#1937)
- Corrected bug in tbl_cross() where the digits argument was not always being passed accurately to tbl_summary(). (#1943)
Other updates
- The total N is now returned with .$cards using the cards::ard_total_n() function for the calculation.
- The default headers for tbl_ard_*() functions no longer include counts, as these are not required data to be passed along in the ARD input.
- The summary statistics of the 'by' variable are no longer required in the ARD for functions tbl_ard_summary() and tbl_ard_continuous(). When the tabulation summary statistics are passed, they are available to place in the header dynamically. (#1860)
- The tbl_ard_wide_summary() function no longer requires the results from cards::ard_attributes() to create tables. (#1873)
- Added the label argument to functions tbl_ard_summary(), tbl_ard_wide_summary(), and tbl_ard_continuous(). (#1850)
- The add_glance*(glance_fun) argument's default value has been updated to an S3 generic, allowing bespoke handling for some regression classes. (#1822)
- Added add_overall.tbl_ard_summary() S3 method. (#1848)
- Added function tbl_likert() for summarizing ordered categorical (or Likert scale) data, as well as the associated add_n.tbl_likert() S3 method. (#1660)
- Fix where error or warning condition messages containing curly brace pairs could not be printed.
- Updated the show_header_names() output to include the values that may be dynamically placed in the headers. Additionally, the include_example and quiet arguments have been deprecated. (#1696)
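A minimal sketch of tbl_likert() with its add_n() method, assuming gtsummary 2.0.2 or later. The data frame, its column names (q1, q2), and the response levels are hypothetical and simulated purely for illustration:

```r
library(gtsummary)

# Hypothetical Likert responses: every item is a factor sharing
# the same ordered set of levels.
likert_levels <- c("Strongly Disagree", "Disagree", "Agree", "Strongly Agree")

set.seed(1123)
df_likert <- data.frame(
  q1 = factor(sample(likert_levels, 50, replace = TRUE), levels = likert_levels),
  q2 = factor(sample(likert_levels, 50, replace = TRUE), levels = likert_levels)
)

# One row per item, one column per response level,
# plus a column with the number of observations per item.
tbl <-
  df_likert |>
  tbl_likert() |>
  add_n()

tbl
```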
gtsummary 2.0.1
Updates to address regressions in the v2.0.0 release:
- Restore functionality of the inline_text.tbl_summary(column) argument to specify a by level when the by variable is a factor. (#1883)
- Correct the order of the columns when the tbl_summary(by) variable has ten or more levels. (#1877)
- Re-establishing strong link between header by variable levels and those in the table body to ensure correct ordering of columns in tbl_summary().
- The tbl_survfit(times) argument accepts integers once again. (#1867)
- Fix in tbl_uvregression() for the formula argument when it includes a hard-coded column name, e.g. formula='{y} ~ {x} + grade'. The hard-coded variable name is now removed from the include argument. (#1886)
- Fix for non-Base R classes tabulated with tbl_summary() that would not coerce to character correctly after unlist(). (#1893)
- Updated the styling function from style_percent() to style_number(scale=100) when the user passes an integer to change the rounding of percentages in tbl_summary(). (#1899)
Other updates
- The {tidycmprsk} dependency has been removed and the tbl_regression.tidycrr() method has been migrated to the {tidycmprsk} package. (#1865)
- The class of tbl_split() objects has been updated from "tbl_split" to c("tbl_split", "list"). (#1854)
- Updated the default value of tbl_ard_summary(missing) to "no". (#1857)
- Line breaks (i.e. '\n') are now auto-stripped from gt-rendered tables when in an R markdown or Quarto environment. (#1896)
gtsummary 2.0.0
New Features
- Clearer error messages have been introduced throughout the package. We've adopted {cli} for all our messaging to users. Our goal was to return a clear message to users for all scenarios.
- Added functions tbl_wide_summary() and tbl_ard_wide_summary() for simple summaries across multiple columns.
- The {gt} package is now the default printer for all Quarto and R markdown output formats.
  - Previously, when printing a gtsummary table in a Quarto or R markdown document, we would detect the output format and convert to gt, flextable, or kable to provide the best-looking table. The {gt} package has matured and provides lovely tables for nearly all output types, and we have now made {gt} the default table drawing tool for all gtsummary tables. These output types are still supported.
- Previously, to report a single statistic to additional levels of precision in a tbl_summary() table, the precision of every summary statistic for that variable had to be specified. Now, we can simply update the one statistic we're interested in with a named list or vector: tbl_summary(digits = age ~ list(sd = 2)).
- New functions tbl_ard_summary() and tbl_ard_continuous() have been added. These provide general tools for creating bespoke summary tables. Rather than accepting a data frame, these functions accept an ARD object (Analysis Results Dataset, often created with the {cards} or {cardx} packages). The ARD objects align with the emerging CDISC Analysis Results Standard. ARDs are now used throughout the package. See below under the "Internal Storage" heading.
- The default add_global_p(anova_fun) argument value has been updated to global_pvalue_fun(), which is an S3 generic. The default method still calls car::Anova() for the calculation. Methods for tidycmprsk::crr() and geepack::geeglm() have been added that wrap aod::wald.test(), as these regression model types are not supported by car::Anova().
- The add_ci.tbl_summary() S3 method has been updated with new ways to calculate the confidence interval: Wald with and without continuity correction, Agresti-Coull, and Jeffreys.
- Added a family of functions, label_style_*(), that are similar to the style_*() functions except that they return a styling function rather than a styled value.
- Functions tbl_summary() and tbl_svysummary() have gained the missing_stat argument, which gives users great control over the statistics presented in the missing row of a summary table.
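The single-statistic precision update described above can be sketched as follows, assuming gtsummary 2.0.0 or later and the trial data set that ships with the package. The choice of the mean (SD) statistic is only for illustration:

```r
library(gtsummary)

# Report mean (SD) for age and override the rounding of the SD only;
# the mean keeps its default precision.
tbl <-
  trial |>
  tbl_summary(
    include = age,
    statistic = age ~ "{mean} ({sd})",
    digits = age ~ list(sd = 2)
  )

tbl
```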
Internal Storage
- Greater consistency has been put in place for all calculated statistics in gtsummary. Previously, each function handled its own calculations and transformed these statistics into data frames that would be printed. Now each function will first prepare an Analysis Result Dataset (ARD), and ARDs are converted to gtsummary structures using bridge functions (prefixed with brdg_*()). The bridge functions will be exported to allow anyone to more easily extend gtsummary functions.
  - These ARDs are now used to calculate the summary statistics for nearly every function in gtsummary. The raw summary statistics are saved in .$cards.
  - Users who previously accessed the internals of a gtsummary object will find the structure has been updated, and this may be an important breaking change.
- Calculations that require other packages have been placed in another package called {cardx}. This package creates ARD objects with the calculated statistics.
- In tbl_regression(), the .$model_obj is no longer returned with the object. The modeling object is, and always has been, available in .$inputs$x.
- When the gtsummary package was first written, the gt package was not on CRAN and the version of the package that was available did not have the ability to merge columns. Due to these limitations, the "ci" column was added to show the combined "conf.low" and "conf.high" columns. Column merging in both the gt and gtsummary packages has matured over the years, and we are now adopting a more modern approach by using these features. As a result, the "ci" column will eventually be dropped from .$table_body. By using column merging, conf.low and conf.high remain numeric and we can continue to update how these columns are formatted. Review ?deprecated_ci_column for details.
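A short sketch of where these internals live, assuming gtsummary 2.0.0 or later and the packages tbl_regression() relies on. The linear model here is an arbitrary illustration:

```r
library(gtsummary)

# The ARD(s) behind a summary table are stored in .$cards
tbl <- tbl_summary(trial, include = age)
tbl$cards

# For regression tables, the model object is found in .$inputs$x
# (the former .$model_obj element is no longer returned)
mod <- lm(age ~ marker, data = trial)
tbl_reg <- tbl_regression(mod)
class(tbl_reg$inputs$x)
```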
Documentation
- The vignettes "FAQ+Gallery", tbl_summary() Tutorial, tbl_regression() Tutorial, and Quarto+R Markdown have been converted to articles. The URLs on the website have not changed for these pages, but the vignettes are no longer bundled in the package. This change allows us to provide better documentation, utilizing more tools that don't need to be included in the package.
Minor Improvements
- Argument add_p.tbl_summary(adj.vars) was added to more easily add p-values that are adjusted/stratified by other columns in a data frame.
- Messaging and checks have been improved when tidyselect is invoked in the package, i.e. when the tilde is used to select variables, as in age ~ "Patient Age". The subset of variables that can be selected is now reduced to the variables present in the table. For example, if you have a summary table of patient age (and only patient age), and age is a single column from a data set of many columns and you misspell age (aggge ~ "Patient Age"), the error message will now ask if you meant "age" instead of listing every column in the data set.
  - Note that as before, you can circumvent tidyselect by using a named list, e.g. list(age = "Patient Age").
- Added the following methods for calculating differences in add_difference.tbl_summary(): Hedge's G, Paired data Cohen's D, and Paired data Hedge's G. All three are powered by the {effectsize} package.
- The counts in the header of tbl_summary(by) tables now appear on a new line, e.g. "**{level}** \nN = {n}".
- In tbl_summary(), the default calculation for quantiles (e.g. statistics of the form "p25" or "p75") has been updated to quantile(type=2).
- In tbl_summary(), dates and times previously showed only the minimum and maximum values by default. They are now treated as all other continuous summaries and share their default statistics of the median and IQR.
- Previously, indentation was handled with modify_table_styling(text_format = c("indent", "indent2")), which would indent a cell 4 and 8 spaces, respectively. Handling of indentation has been migrated to modify_table_styling(indent = integer()), and by default, the label column is indented to zero spaces. This makes it easier to indent a group of rows.
- The inputs for modify_table_styling(undo_text_format) have been updated to mirror its counterpart modify_table_styling(text_format), and the argument no longer accepts TRUE or FALSE.
- The values passed in tbl_summary(value) are now only checked for columns that are summary type "dichotomous".
- The gtsummary selecting functions, e.g. all_categorical(), all_continuous(), etc., are now simplified by wrapping tidyselect::where(), which was not available when these functions were originally written. Previously, these functions would error if used out of context; they now, instead, select no columns when used out of context.
- The design-based t-test has been added as a possible method for add_difference.tbl_svysummary() and is now the default for continuous variables.
- When add_ci() is run after add_overall(), the overall column is now populated with the confidence interval. (#1569)
- Added pkgdown_print.gtsummary() method that is only registered when the pkgdown package is loaded. This enables printing of gtsummary tables on the pkgdown site in the Examples section. (#1771)
- The package now uses the updated survey::svyquantile() function to calculate quantiles, which was introduced in survey v4.1.
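To illustrate the two ways of specifying a label mentioned above, here is a small sketch assuming gtsummary 2.0.0 or later and the bundled trial data. Both calls are expected to produce the same label; the second bypasses tidyselect by using a named list:

```r
library(gtsummary)

# Formula (tidyselect) interface: the left-hand side selects the variable
tbl_formula <- tbl_summary(trial, include = age, label = age ~ "Patient Age")

# Named-list interface: no tidyselect evaluation is involved
tbl_list <- tbl_summary(trial, include = age, label = list(age = "Patient Age"))

tbl_formula$table_body$label
tbl_list$table_body$label
```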
Bug fixes
- Fix in add_difference() for paired t-tests. Previously, the sign of the reported difference depended on which group appeared first in the source data. The function has been updated to consistently report the difference as the first group mean minus the second group mean. (#1557)
Lifecycle changes
- A couple of small changes to the default summary type in tbl_summary() have been made.
  - If a column is all NA_character_ in tbl_summary(), the default summary type is now "continuous", where previously it was "dichotomous".
  - Previously, in tbl_summary(), variables that were c(0, 1), c("no", "yes"), c("No", "Yes"), and c("NO", "YES") would default to a dichotomous summary with the 1 and yes level being shown in the table. This would occur even in the case when, for example, only 0 was observed. In this release, the line shown for dichotomous variables must be observed OR the unobserved level must be either explicitly defined in a factor or be a logical vector. This means that a character vector of all "yes" or all "no" values will default to a categorical summary instead of dichotomous.
- When using the tbl_summary(value) argument, we no longer allow unobserved levels to be used unless it is an unobserved factor level or logical level.
- The quiet argument has been deprecated throughout the package, except in tbl_stack(). Documentation has been updated to ensure clarity in all methods.
- The inline_text(level) argument now expects a character value.
- The tbl_butcher(include) argument now only accepts character vectors.
- The following theme elements have been deprecated:
  - These theme elements will eventually be removed from the package: 'tbl_summary-arg:label', 'add_p.tbl_summary-arg:pvalue_fun', 'tbl_regression-arg:pvalue_fun', 'tbl_regression-chr:tidy_columns'.
    - The pvalue_fun elements should switch to the package-wide theme for p-value styling: 'pkgwide-fn:pvalue_fun'.
  - These theme elements have been removed from the package immediately due to str...
gtsummary 1.7.2
- Removed messaging about the former auto-removal of the tbl_summary(group) variable from the table: a change that occurred 3+ years ago in gtsummary v1.3.1.
- Fix in as_flex_table() where source notes were not accurately rendered. (#1520)
- Fix in column order when add_ci() is run after add_overall(last=TRUE). Previously, the overall columns were placed in front. (#1525)
- Line breaks (i.e. \n) are now removed from column headers and table cells when as_kable() is called. (#1526)
- Fix in as_gt() where columns with common spanning headers were gathered. Corrected with gt::tab_spanner(gather = FALSE). (#1527)
- Fix in remove_row_type() where header rows for continuous2 type variables were not removed when requested. (#1507)
- Fix where some default add_p.tbl_summary() categorical tests were chi-squared when they should have been Fisher's exact test. This misclassification occurred in some cases when there was a large differential in the missing pattern for one of the variables in the cross table. (#1513)
- Fix in add_overall(col_label=) where the specified label was not always placed. (#1505)
gtsummary 1.7.1
New Functions
- Added as.data.frame() S3 method for gtsummary class.
New Functionality
- The tbl_svysummary() function may now report the design effect, e.g. tbl_svysummary(statistic = ~"{deff}"). (#1486)
- Added French translations for new marginal effects tidiers housed in {broom.helpers}. (#1417)
- Added theme elements to control the default headers in tbl_svysummary(). (#1452)
- Improved error messaging in tbl_uvregression() when the method= argument is not correctly specified. (#1469)
- Updates to account for changes in {forcats} v1.0.0 and {dplyr} v1.1.0.
- tbl_svysummary() can now report design effects. (#1486)
Bug Fixes
- Fix in the footnote of add_overall() when run after tbl_continuous(). (#1436)
- Updated the levels of precision used in round2(), which is used in the background for every rounded/formatted number in a gtsummary table. (#1494)
- Bug fix when a subset of CIs are requested in add_ci(include=). (#1484)
- Update in as_hux_table() to ensure the Ns in the header are not incorrectly auto-formatted by {huxtable}.
- Fix in the style_*() family of functions. The attributes of the input vector, excluding the class, are retained. (#1460)
- Updated style_ratio() to now format negative values.
- Bug fix in add_ci.tbl_svysummary() for dichotomous variables.
- add_ci.tbl_svysummary() now properly takes into account the percent argument. (#1470)
gtsummary 1.7.0
Breaking Changes
- Updated the default argument values in tidy_robust(vcov=NULL, vcov_args=NULL). Users must specify the type of robust standard errors using these arguments.
- Fully removed deprecated items that were originally deprecated in v1.2.5 (released 3 years ago).
  - add_p(exclude=), as_gt(exclude=), as_kable(exclude=), as_tibble.gtsummary(exclude=), tbl_regression(exclude=), tbl_uvregression(exclude=)
  - tbl_summary_(), add_p_()
  - add_global_p(terms=)
New Functions
- New function add_ci.tbl_svysummary() for adding confidence intervals to tbl_svysummary() summary statistics. (#965)
New Functionality
- Arguments passed via the dots in tbl_uvregression(...) are now passed to broom.helpers::tidy_plus_plus(...). (#1396)
- Added new theme elements to control the default headers in tbl_summary(). (#1401)
- All examples that previously used <br> for line breaks in gt tables have been updated to use \n. Additionally, the "qjecon" journal theme has been updated to use the new line break as well. (#1311)
- Now allowing for mixed-class numeric types in tbl_summary(), such that inline_text() will not throw an error when the pattern argument is specified.
- Added stats::mood.test() to add_p.tbl_summary(). (#1397)
New Documentation
- Added a new article illustrating how to place gtsummary tables into Shiny applications. (#1335)
Bug Fixes
gtsummary 1.6.3
- The as_flex_table() function now recognizes markdown bold (**) and italic (_) syntax in the headers and spanning headers. Restrictions apply. See help file for details. Users can no longer place sets of double stars and underscores without the text being formatted as markdown syntax. (#1361)
- The modify_caption() function now works with tables created with gtreg::tbl_listing() that do not contain a column named "label". (#1358)
- Functions tbl_summary() and tbl_svysummary() now support "{n}", "{p}", and "{level}" when no by= variable is present for use in functions like modify_header(). For example, the following previously invalid code works well for both the overall column and the stratified columns: trial %>% tbl_summary(by = trt) %>% add_overall() %>% modify_header(all_stat_cols() ~ "**{level}**, N = {n}"). For the survey summary, the unweighted variants are also available. (#1366)
- Added experimental feature where additional arguments can be passed to broom.helpers::tidy_plus_plus() via tbl_regression(...). (#1383)
- Updated the arguments in tidy_robust() to account for updates in {parameters}. (#1376)
- Allowing for 'survfit' objects of class survfit2 in add_nevent.tbl_survfit(). (#1389)
- Added oneway.test() test to add_p(). (#1382)
- Now using {ggstats} to plot regression model coefficients via plot() instead of {GGally}. (#1367)
- Bug fix in tbl_custom_summary(): the full dataset (including missing observations) is now properly passed as full_data. (#1388)
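The header example from the notes above can be laid out as a short script. This is a sketch that assumes a current gtsummary installation and R 4.1 or later for the native pipe; the include = c(age, grade) argument is an addition to keep the table small and is not part of the original example:

```r
library(gtsummary)

# "{level}" and "{n}" are filled in for every statistic column,
# including the overall column added by add_overall().
tbl <-
  trial |>
  tbl_summary(by = trt, include = c(age, grade)) |>
  add_overall() |>
  modify_header(all_stat_cols() ~ "**{level}**, N = {n}")

tbl
```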
gtsummary 1.6.2
- The following updates were made to the indentation implementation for gt output:
  - Previously, only HTML output was able to indent for gt tables, and this was implemented via gt::tab_style().
  - Indentation is now available for HTML, PDF, and Word and is implemented by adding unicode non-breaking spaces to the data frame via gt::text_transform().
  - The "names" for the indentation calls have been updated to "indent" and "indent2". This change should affect very few users. If you're not sure what the names refer to, then this does not affect you.
  - Indentation for RTF does not currently work. Instead of indented columns, irregular unicode characters are shown. This issue will be addressed in a future gt release. If you do use RTF output, and would like your output to be identical to what it was before this update, use as_gt(include = -indent).
- A link to the cheat sheet has been added to the website's navigation bar.
- Added additional options to remove_row_type(type = c("level", "all")).
  - Use type = "all" to remove all rows associated with the variable(s) specified in remove_row_type(variables=).
  - Use type = "level" in conjunction with new argument level_values= to remove specified levels for a variable, or do not use the new argument to remove all levels for categorical variables.
- Added the standard error of means to the list of available statistics for continuous data summaries in tbl_svysummary(). (#1291)
- Added Dutch language translations. (#1302)
- Updated add_significance_stars() to accept any gtsummary table (instead of only regression model summaries) and to work with add_global_p(). (#1320)
- Added the "var_type" hidden column to the output of tbl_survfit(). This addition ensures the table will work with remove_row_type(). (#1343)
- Updated calls to round() in the style_*() functions to round2(), which implements classic rounding rules. (#1304)
- Fixed bug in style_sigfig() with numbers close to the thresholds. (#1298)
- Fixed bug when a column named "variable" was passed to tbl_custom_summary(by=), which resulted in an error. (#1285)
- Bug fix in as_tibble(fmt_missing = TRUE). Previously, missing assignments applied to more than one row were being ignored. (#1327)
- Bug fix in column alignment with tbl_stack() for as_kable_extra() output. (#1326)
gtsummary 1.6.1
New Functionality
- Added the standard error of proportions to the list of available statistics for categorical data summaries in tbl_svysummary(). (#1187)
- Added Tarone-Ware test to add_p.tbl_survfit(). (#732)
- Updated add_global_p() to handle tbl_uvregression() objects where users specified the x= argument (when the y= argument is more common). (#1260)
Other Updates
- Updated start-up messaging. (#1228)
- The paired.wilcox.test available in add_p.tbl_summary() and add_difference.tbl_summary() was mistakenly marked as returning a difference, but it does not. The documentation has been corrected, which results in improved messaging to the user when the test is selected in add_difference(). (#1279)
- Improved error messages for paired tests in add_p() and add_difference() when the group= argument is not specified. (#1273)
- Added the with_gtsummary_theme(msg_ignored_elements=) argument. Use this argument to message users if any theme elements will be overwritten and therefore ignored inside the with_gtsummary_theme() call. (#1266)
- Swapped gt::fmt_missing() for gt::sub_missing() as the former is now deprecated. (#1257)
- The checks for "haven_labelled" class are now only performed for the variables indicated in include= and by= in tbl_summary() and tbl_svysummary(). The checks in tbl_uvregression() and tbl_survfit.data.frame() are only applied to the variables in include=, e.g. no checking for the outcome variable(s).
- Updates to labels and default formatting functions of unweighted statistics presented in tbl_svysummary(). (#1253)
- Added structural checks in tbl_merge() and inline_text() to provide better error messaging. (#1248)
- Added tbl_regression.crr() method with messaging recommending use of tidycmprsk::crr() instead. (#1237)
- The experimental support for ftExtra::colformat_md() in as_flex_table() has been removed. The function requires evaluated YAML paths and does not allow un-evaluated references like bibliography:: "`r here::here()`". (#1229)
- Update for tbl_summary(by=) that now allows for a column named "variable" to be passed. (#1234)
- Added theme element to control what missing statistic is shown in summary tables with options to display number or percent missing or non-missing or total number of observations. (#1224)
- Renamed modify_cols_merge() to modify_column_merge() to be in line with the other modify_column_*() functions.
Bug Fixes
- Fix in as_kable_extra() when output format is 'latex' where a cell that had been bold or italicized had special characters double-escaped. Added a condition not to escape special characters in these styled cells. (#1230)
- Fix in with_gtsummary_theme(). The function restored any previously set theme, but inadvertently included the temporary theme along with it.