Bug: create_iso8601 #101

Open
rammprasad opened this issue Oct 25, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@rammprasad
Collaborator

What happened?

date_val <- c(NA, "15-Sep-2022", "17-Feb-21", "4-Oct-20", "20-Jan-20", "UN-UNK-1995", NA, "UN-UNK-21",
       "26-Jan-20", "28 Jan 2021", "12-Feb-20", "10-UNK-20", NA, NA)

create_iso8601(date_val, .format = c("d-m-y"), .na = c("UN", "UNK"))

problems()

There are two issues here:

  1. This example does not process the value "28 Jan 2021". When I change the function call to create_iso8601(date_val, .format = c("d-m-y", "dd MMM yyyy"), .na = c("UN", "UNK")), I receive an error: ! Number of vectors in ... should match length of .format.

  2. Missing values are reported as problems.

Expected Behavior:

  1. I should be able to provide more than one format for a vector. As long as the date matches one of the formats, the function should process it.

  2. Missing values should not be reported as problems.

cc - @ramiromagno

Session Information

No response

Reproducible Example

No response

@rammprasad rammprasad added the bug Something isn't working label Oct 25, 2024
@github-project-automation github-project-automation bot moved this to Product Backlog in sdtm.oak R package Oct 25, 2024
@ramiromagno
Collaborator

Hi Ramm,

Making missing values not be regarded as problems would need a fix. That said, I believe it was decided this way in one of our meetings, so it was a feature. :)

Regarding "28 JAN 2021", look carefully for the difference between my code and yours... And brush up again: https://pharmaverse.github.io/sdtm.oak/articles/iso_8601.html#multiple-alternative-formats.

library(sdtm.oak)
date_val <- c(
  NA,
  "15-Sep-2022",
  "17-Feb-21",
  "4-Oct-20",
  "20-Jan-20",
  "UN-UNK-1995",
  NA,
  "UN-UNK-21",
  "26-Jan-20",
  "28 Jan 2021",
  "12-Feb-20",
  "10-UNK-20",
  NA,
  NA
)

create_iso8601(date_val,
               .format = list(c("d-m-y", "d m y")),
               .na = c("UN", "UNK"))
#>  [1] NA           "2022-09-15" "2021-02-17" "2020-10-04" "2020-01-20"
#>  [6] "1995"       NA           "2021"       "2020-01-26" "2021-01-28"
#> [11] "2020-02-12" "2020---10"  NA           NA
problems()

Created on 2024-10-25 with reprex v2.1.1
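
For anyone else hitting the same error, here is a small base R sketch of why the two calls differ. It assumes, as the error message suggests, that the length of .format is matched against the number of input vectors passed through ...; the names fmt_vec, fmt_list and n_inputs are only illustrative and are not part of sdtm.oak.

```r
# Two alternative formats supplied as a plain character vector:
# this has length 2, so it reads as "one format per input vector".
fmt_vec <- c("d-m-y", "d m y")

# The same two formats wrapped in a list: this has length 1,
# i.e. one element (holding two alternatives) for one input vector.
fmt_list <- list(c("d-m-y", "d m y"))

# Only one input vector (date_val) is passed in the example above.
n_inputs <- 1L

length(fmt_vec) == n_inputs   # FALSE: length mismatch
length(fmt_list) == n_inputs  # TRUE: lengths agree
```

So with a single input vector, the alternatives have to live inside one list element, as in the call above.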

@ramiromagno ramiromagno self-assigned this Oct 25, 2024
@ramiromagno
Collaborator

create_iso8601() accepts multiple vectors through its ... argument. So, what should happen when an element of, say, a first vector (e.g. date) is NA, but the corresponding element of the second vector (e.g. time) is non-missing and fails the parsing? Should that be reported as a problem?

@rammprasad
Collaborator Author

Issue 1 - My bad, I missed wrapping the formats in a list in the function call.
Issue 2 - I see. Ideally, blank values would not trigger a warning message. In practice, clinical trial data contain a lot of blank values.

Let's just fix the second one.

@ramiromagno
Collaborator

Issue 2: Given that create_iso8601() is designed as a multiple-value input function, when you say blank, do you mean blanks in all inputs? A blank in at least one of the inputs? How should blanks play with problems in non-blank values (is the result a problem or simply a non-problem blank)?
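
To make the options concrete, here is a rough base R sketch of two possible policies. It is not how sdtm.oak currently works: the inputs are made up, blanks are taken to mean NA, and parse_failed is a hypothetical stand-in for whatever the parser would flag.

```r
# Hypothetical inputs: a date vector and a time vector of equal length.
date <- c("2020-01-20", NA, NA,      "2020-02-12")
time <- c("10:30",      NA, "25:99", NA)

# One column per input; TRUE where that input is blank (NA).
is_blank <- cbind(date = is.na(date), time = is.na(time))

# Row-wise summaries across all inputs.
all_blank <- rowSums(is_blank) == ncol(is_blank)
any_blank <- rowSums(is_blank) > 0

# Hypothetical parser outcome: only row 3 has a non-blank value ("25:99")
# that fails to parse.
parse_failed <- c(FALSE, FALSE, TRUE, FALSE)

# Policy A: a row is exempt from problems only if every input is blank.
problem_a <- parse_failed & !all_blank

# Policy B: a row is exempt from problems if at least one input is blank.
problem_b <- parse_failed & !any_blank
```

Row 3 (blank date, unparseable time) is where the two policies disagree: policy A still reports it, policy B silently drops it.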

Projects
Status: Product Backlog
Development

No branches or pull requests

2 participants