Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unexpected error when using is.na() with dbplyr::filter() #1509

Closed
sjdshapiro opened this issue Jun 3, 2024 · 1 comment
Closed

unexpected error when using is.na() with dbplyr::filter() #1509

sjdshapiro opened this issue Jun 3, 2024 · 1 comment

Comments

@sjdshapiro
Copy link

sjdshapiro commented Jun 3, 2024

When trying to filter NAs using is.na() in filter(), dbplyr is not working as expected.

I have code that I regularly run and it was working as expected until recently (maybe a month or so ago, I'm not sure).

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(dbplyr)
#> 
#> Attaching package: 'dbplyr'
#> The following objects are masked from 'package:dplyr':
#> 
#>     ident, sql

mf <- memdb_frame(g = c(NA, 1, 2, 2, 2), x = 1:5, y = 5:1)

mf_filtered <- mf |> 
  filter(is.na(g))

# I expect the `mf_filtered ` object to give me the filtered rows of `mf`, but there is
# an unexpected error if I try to run mf_filtered .
mf_filtered 
#> Error in `collect()`:
#> ! Failed to collect lazy table.
#> Caused by error:
#> ! near ")": syntax error
#> Backtrace:
#>      ▆
#>   1. ├─base::tryCatch(...)
#>   2. │ └─base (local) tryCatchList(expr, classes, parentenv, handlers)
#>   3. │   ├─base (local) tryCatchOne(...)
#>   4. │   │ └─base (local) doTryCatch(return(expr), name, parentenv, handler)
#>   5. │   └─base (local) tryCatchList(expr, names[-nh], parentenv, handlers[-nh])
#>   6. │     └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
#>   7. │       └─base (local) doTryCatch(return(expr), name, parentenv, handler)
#>   8. ├─base::withCallingHandlers(...)
#>   9. ├─base::saveRDS(...)
#>  10. ├─base::do.call(...)
#>  11. ├─base (local) `<fn>`(...)
#>  12. └─global `<fn>`(input = base::quote("legal-mink_reprex.R"))
#>  13.   └─rmarkdown::render(input, quiet = TRUE, envir = globalenv(), encoding = "UTF-8")
#>  14.     └─knitr::knit(knit_input, knit_output, envir = envir, quiet = quiet)
#>  15.       └─knitr:::process_file(text, output)
#>  16.         ├─base::withCallingHandlers(...)
#>  17.         ├─base::withCallingHandlers(...)
#>  18.         ├─knitr:::process_group(group)
#>  19.         └─knitr:::process_group.block(group)
#>  20.           └─knitr:::call_block(x)
#>  21.             └─knitr:::block_exec(params)
#>  22.               └─knitr:::eng_r(options)
#>  23.                 ├─knitr:::in_input_dir(...)
#>  24.                 │ └─knitr:::in_dir(input_dir(), expr)
#>  25.                 └─knitr (local) evaluate(...)
#>  26.                   └─evaluate::evaluate(...)
#>  27.                     └─evaluate:::evaluate_call(...)
#>  28.                       ├─evaluate (local) handle(...)
#>  29.                       │ └─base::try(f, silent = TRUE)
#>  30.                       │   └─base::tryCatch(...)
#>  31.                       │     └─base (local) tryCatchList(expr, classes, parentenv, handlers)
#>  32.                       │       └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
#>  33.                       │         └─base (local) doTryCatch(return(expr), name, parentenv, handler)
#>  34.                       ├─base::withCallingHandlers(...)
#>  35.                       ├─base::withVisible(value_fun(ev$value, ev$visible))
#>  36.                       └─knitr (local) value_fun(ev$value, ev$visible)
#>  37.                         └─knitr (local) fun(x, options = options)
#>  38.                           ├─base::withVisible(knit_print(x, ...))
#>  39.                           ├─knitr::knit_print(x, ...)
#>  40.                           └─rmarkdown:::knit_print.tbl_sql(x, ...)
#>  41.                             ├─context$df_print(x)
#>  42.                             └─dbplyr:::print.tbl_sql(x)
#>  43.                               ├─dbplyr:::cat_line(format(x, ..., n = n, width = width, n_extra = n_extra))
#>  44.                               │ ├─base::cat(paste0(..., "\n"), sep = "")
#>  45.                               │ └─base::paste0(..., "\n")
#>  46.                               ├─base::format(x, ..., n = n, width = width, n_extra = n_extra)
#>  47.                               └─pillar:::format.tbl(x, ..., n = n, width = width, n_extra = n_extra)
#>  48.                                 └─pillar:::format_tbl(...)
#>  49.                                   └─pillar::tbl_format_setup(...)
#>  50.                                     ├─pillar:::tbl_format_setup_dispatch(...)
#>  51.                                     └─pillar:::tbl_format_setup.tbl(...)
#>  52.                                       └─pillar:::df_head(x, n + 1)
#>  53.                                         ├─base::as.data.frame(head(x, n))
#>  54.                                         └─dbplyr:::as.data.frame.tbl_sql(head(x, n))
#>  55.                                           ├─base::as.data.frame(collect(x, n = n))
#>  56.                                           ├─dplyr::collect(x, n = n)
#>  57.                                           └─dbplyr:::collect.tbl_sql(x, n = n)
#>  58.                                             └─base::tryCatch(...)
#>  59.                                               └─base (local) tryCatchList(expr, classes, parentenv, handlers)
#>  60.                                                 └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
#>  61.                                                   └─value[[3L]](cond)
#>  62.                                                     └─cli::cli_abort("Failed to collect lazy table.", parent = cnd)
#>  63.                                                       └─rlang::abort(...)

# I expect the SQL translation to be:

#   SELECT *
#   FROM `dbplyr_001`
# WHERE ((`g` IS NULL))

# However, this is what happens: 
mf_filtered |> show_query()
#> <SQL>
#> SELECT *
#> FROM `dbplyr_001`
#> WHERE ((`g` IS ))

sessionInfo()
#> R version 4.4.0 (2024-04-24)
#> Platform: x86_64-redhat-linux-gnu
#> Running under: Red Hat Enterprise Linux 8.10 (Ootpa)
#> 
#> Matrix products: default
#> BLAS/LAPACK: /usr/lib64/libopenblaso-r0.3.15.so;  LAPACK version 3.9.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> time zone: America/Chicago
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] dbplyr_2.3.3 dplyr_1.1.3 
#> 
#> loaded via a namespace (and not attached):
#>  [1] vctrs_0.6.5       cli_3.6.2         knitr_1.44        rlang_1.1.3      
#>  [5] xfun_0.39         DBI_1.1.3         purrr_1.0.2       generics_0.1.3   
#>  [9] bit_4.0.5         glue_1.7.0        htmltools_0.5.7   fansi_1.0.5      
#> [13] rmarkdown_2.23    evaluate_0.23     tibble_3.2.1      fastmap_1.1.1    
#> [17] yaml_2.3.7        lifecycle_1.0.4   memoise_2.0.1     compiler_4.4.0   
#> [21] blob_1.2.4        RSQLite_2.3.1     fs_1.6.3          pkgconfig_2.0.3  
#> [25] rstudioapi_0.15.0 digest_0.6.35     R6_2.5.1          reprex_2.0.2     
#> [29] tidyselect_1.2.0  utf8_1.2.4        pillar_1.9.0      magrittr_2.0.3   
#> [33] bit64_4.0.5       tools_4.4.0       withr_3.0.0       cachem_1.0.8

Created on 2024-06-03 with reprex v2.0.2

@sjdshapiro
Copy link
Author

This is now resolved after updating (not sure if it was the R or package update or both that did it. My current session info where it is working is copied below.

R version 4.4.2 (2024-10-31)
Platform: x86_64-redhat-linux-gnu
Running under: Red Hat Enterprise Linux 8.10 (Ootpa)

Matrix products: default
BLAS/LAPACK: /usr/lib64/libopenblaso-r0.3.15.so;  LAPACK version 3.9.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: America/Chicago
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] dbplyr_2.5.0

loaded via a namespace (and not attached):
 [1] utf8_1.2.4        R6_2.5.1          tidyselect_1.2.1  magrittr_2.0.3    glue_1.8.0        tibble_3.2.1      pkgconfig_2.0.3  
 [8] dplyr_1.1.4       generics_0.1.3    lifecycle_1.0.4   cli_3.6.3         fansi_1.0.6       vctrs_0.6.5       DBI_1.2.3        
[15] compiler_4.4.2    rstudioapi_0.17.1 tools_4.4.2       pillar_1.9.0      rlang_1.1.4      /

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant