Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistency in mssql translation of if_else #1569

Open
edward-burn opened this issue Jan 5, 2025 · 1 comment
Open

Inconsistency in mssql translation of if_else #1569

edward-burn opened this issue Jan 5, 2025 · 1 comment

Comments

@edward-burn
Copy link
Contributor

Looking at if_else on different databases, @hadley @simonpcouch I think the translation for mssql might be somewhat inconsistent with respect to missing values. If the data was in r then in the below example a missing value would be returned if the variable was missing. The same behaviour is seen for most databases, but for mssql IIF is being used and doesn't return missing. Perhaps changing this to CASE WHEN like for the other dbms would make sense?

library(DBI)
#> Warning: package 'DBI' was built under R version 4.4.1
library(dplyr, warn.conflicts = FALSE)
library(dbplyr, warn.conflicts = FALSE)
library(duckdb)
#> Warning: package 'duckdb' was built under R version 4.4.2
packageVersion("dbplyr")
#> [1] '2.5.0'
data.frame(var = c("a", "b", NA)) |> 
  mutate(ifelse = if_else(var == "a", 1L, 2L),
         if_else = if_else(var == "a", 1L, 2L))
#>    var ifelse if_else
#> 1    a      1       1
#> 2    b      2       2
#> 3 <NA>     NA      NA

translate_sql(if_else(var == "a", 1L, 2L), 
              con = simulate_duckdb())
#> <SQL> CASE WHEN (`var` = 'a') THEN 1 WHEN NOT (`var` = 'a') THEN 2 END
translate_sql(if_else(var == "a", 1L, 2L), 
              con = simulate_postgres())
#> <SQL> CASE WHEN (`var` = 'a') THEN 1 WHEN NOT (`var` = 'a') THEN 2 END
translate_sql(if_else(var == "a", 1L, 2L), 
              con = simulate_redshift())
#> <SQL> CASE WHEN (`var` = 'a') THEN 1 WHEN NOT (`var` = 'a') THEN 2 END
translate_sql(if_else(var == "a", 1L, 2L), 
              con = simulate_mssql())
#> <SQL> IIF(`var` = 'a', 1, 2)

# duckdb
db <- dbConnect(duckdb::duckdb())
DBI::dbWriteTable(db, "test", data.frame(var = c("a", "b", NA)))
tbl(db, "test") |> 
  mutate(ifelse = if_else(var == "a", 1L, 2L),
         if_else = if_else(var == "a", 1L, 2L))
#> # Source:   SQL [?? x 3]
#> # Database: DuckDB v1.1.3 [eburn@Windows 10 x64:R 4.4.0/:memory:]
#>   var   ifelse if_else
#>   <chr>  <int>   <int>
#> 1 a          1       1
#> 2 b          2       2
#> 3 <NA>      NA      NA

# sql server
db <- DBI::dbConnect(odbc::odbc(),
                     Driver   = Sys.getenv("SQL_SERVER_DRIVER"),
                     Server   = Sys.getenv("CDM5_SQL_SERVER_SERVER"),
                     Database = Sys.getenv("CDM5_SQL_SERVER_CDM_DATABASE"),
                     UID      = Sys.getenv("CDM5_SQL_SERVER_USER"),
                     PWD      = Sys.getenv("CDM5_SQL_SERVER_PASSWORD"),
                     TrustServerCertificate="yes",
                     Port     = Sys.getenv("CDM5_SQL_SERVER_PORT"))
DBI::dbWriteTable(db, "tempdb.dbo.test", data.frame(var = c("a", "b", NA)), 
                  overwrite = TRUE)
tbl(db, "tempdb.dbo.test") |> 
  mutate(ifelse = if_else(var == "a", 1L, 2L),
         if_else = if_else(var == "a", 1L, 2L))
#> # Source:   SQL [?? x 3]
#> # Database: Microsoft SQL Server 15.00.4198[ohdsi@EC2AMAZ-QESHSOP/CDMV5]
#>   var   ifelse if_else
#>   <chr>  <int>   <int>
#> 1 a          1       1
#> 2 b          2       2
#> 3 <NA>       2       2

Created on 2025-01-05 with reprex v2.1.0

@hadley
Copy link
Member

hadley commented Jan 6, 2025

That's probably a good idea but it will need careful analysis because of the SQL server BIT vs BOOLEAN pain.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants