Re-add data pre-processing prior to add_difference() and add_p()? #2165

Open
cforne opened this issue Feb 22, 2025 · 1 comment

cforne commented Feb 22, 2025

Dear Daniel:

After updating the gtsummary package, add_difference() on a tbl_summary() with dichotomous variables no longer works. The following error is returned:

For variable XXXX (YYYY) and "estimate", "statistic", "p.value", "parameter", "conf.low", and "conf.high" statistics: Expecting variable to be either <logical> or <numeric/integer> coded as 0 and 1

The dichotomous variable is a factor with levels "Yes" and "No".

In the tbl_summary() output, the dichotomous variable is displayed on a single row. What is wrong?

@ddsjoberg
Owner

Thank you for the post, @cforne. It was a conscious decision to no longer pre-process the data before calculating tests (and other statistical computations), and to leave it to the user to pass data that is compatible with the tests they request.

But it seems I did not give enough consideration to situations like these. I'll think on it, but it feels like this functionality should be added back to the package. Sorry for the inconvenience.

library(gtsummary)

trial |> 
  dplyr::mutate(response = factor(response, levels = c(0, 1), labels = c("no", "yes"))) |> 
  tbl_summary(
    by = trt, 
    include = response
  ) |> 
  add_difference() |> 
  as_kable()
#> The following errors were returned during `as_kable()`:
#> ✖ For variable `response` (`trt`) and "estimate", "statistic", "p.value",
#>   "parameter", "conf.low", and "conf.high" statistics: Expecting `variable` to
#>   be either <logical> or <numeric/integer> coded as 0 and 1.
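
| Characteristic | Drug A N = 98 | Drug B N = 102 | Difference | 95% CI | p-value |
|:---------------|:-------------:|:--------------:|:----------:|:------:|:-------:|
| response       |   28 (29%)    |    33 (34%)    |            |        |         |
| Unknown        |       3       |       4        |            |        |         |
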
trial |> 
  tbl_summary(
    by = trt, 
    include = grade,
    value = grade ~ "I"
  ) |> 
  add_difference() |> 
  as_kable()
#> The following errors were returned during `as_kable()`:
#> ✖ For variable `grade` (`trt`) and "estimate", "statistic", "p.value",
#>   "parameter", "conf.low", and "conf.high" statistics: Expecting `variable` to
#>   be either <logical> or <numeric/integer> coded as 0 and 1.
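
| Characteristic | Drug A N = 98 | Drug B N = 102 | Difference | 95% CI | p-value |
|:---------------|:-------------:|:--------------:|:----------:|:------:|:-------:|
| Grade          |   35 (36%)    |    33 (32%)    |            |        |         |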

Created on 2025-02-22 with reprex v2.1.1
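
Until pre-processing is restored, one possible workaround is to recode the variable yourself into one of the types the error message asks for (logical, or numeric 0/1) before building the table. The sketch below is only an illustration under stated assumptions: `trial_factor`, `trial_recoded`, and `grade_i` are hypothetical names, and the "No"/"Yes" factor is a stand-in for the data described in the original post. It assumes `tbl_summary()` treats a logical column as dichotomous, so the row is still displayed on a single line.

```r
library(gtsummary)

# Hypothetical stand-in for the reporter's data: `response` as a "No"/"Yes" factor.
trial_factor <- trial |>
  dplyr::mutate(response = factor(response, levels = c(0, 1), labels = c("No", "Yes")))

# Recode to logical before summarising, so the difference calculation receives
# a type it accepts. Missing values stay NA.
trial_recoded <- trial_factor |>
  dplyr::mutate(
    response = response == "Yes",  # logical version of the Yes/No factor
    grade_i  = grade == "I"        # logical indicator, used instead of value = grade ~ "I"
  )

tbl <- trial_recoded |>
  tbl_summary(
    by = trt,
    include = c(response, grade_i),
    # factor() and the comparison drop the column labels, so set them explicitly
    label = list(response ~ "Response", grade_i ~ "Grade I")
  ) |>
  add_difference()

tbl
```

The recoding lives in the data step rather than in `tbl_summary()`, so the same columns can be reused for `add_p()` or other computations that expect logical or 0/1 input.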

@ddsjoberg ddsjoberg changed the title Error in add_difference() Re-add data pre-processing prior to add_difference() and add_p()? Feb 23, 2025
@ddsjoberg ddsjoberg added this to the v2.2.0 milestone Feb 23, 2025