Intermittent error: "duplicated values in labels" #104

Closed
tgravelle opened this issue Feb 14, 2023 · 5 comments

@tgravelle

Hello!

In using expss to tabulate survey data, I am receiving the error message below intermittently. By "intermittent" I mean that when this error occurs, I can simply re-run the example code (also below) and often obtain the desired tabulation. For context, all of the columns in data.2 being tabulated are factors (I am not at liberty to share the data).

Error in set_val_lab.default(x, value, add = FALSE) : 
  'set_val_lab' - duplicated values in labels:

banner_tables <- data.2 %>%
  tab_weight(weight = weight) %>%
  tab_total_row_position("above") %>%
  tab_cols(total(), gender, age, education, race.ethn, region) %>%
  tab_cells(Q21.0 %to% Q21.11) %>% tab_stat_cpct() %>%
  tab_cells(Q22.0 %to% Q22.13) %>% tab_stat_cpct() %>%
  tab_cells(Q23.0 %to% Q23.6) %>% tab_stat_cpct() %>%
  tab_cells(Q24.0, Q24.1, Q24.2) %>% tab_stat_cpct() %>%
  tab_cells(Q25.0, Q25.1, Q25.2) %>% tab_stat_cpct() %>%
  tab_pivot() %>%
  as_tibble()

This is rather strange behavior that I haven't encountered with expss before. Why would unchanged code produce varying results -- sometimes returning an error message and sometimes returning the desired tabulation?

Thank you in advance for any insight on what might be causing this.

@gdemin
Owner

gdemin commented Feb 15, 2023

Hello!
Where do you get your data from? Which function do you use to load the data?
Could you apply the function below to your dataset? It will detect factors with duplicated levels:

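# Return the levels of every column whose 'levels' attribute has duplicates
dupl_levels = function(df){
    # 'levels' attribute of each column (NULL for non-factor columns)
    res = lapply(df, attr, 'levels')
    # anyDuplicated() returns 0 when there are no duplicates
    has_dupl = sapply(res, anyDuplicated)
    res[has_dupl > 0]
}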

And could you try to identify the column that causes the error? To do this, remove variables from your table code one at a time until the code stops raising the error; the last variable removed is the likely culprit.
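
Since the error is reported as intermittent, a single run per variable may not be conclusive. Below is a minimal sketch of the one-by-one search with repeated attempts. The helper `find_failing_columns()` and the toy check `check_labels()` are hypothetical names written for illustration and are not part of expss; the toy data only stands in for the real dataset.

```r
# Hypothetical helper, not part of expss: call `f` on each column up to
# `attempts` times and record the first error message seen for that column.
find_failing_columns = function(df, cols, f, attempts = 5){
    failures = list()
    for(col in cols){
        for(i in seq_len(attempts)){
            msg = tryCatch({
                f(df[[col]])
                NULL                      # no error on this attempt
            }, error = function(e) conditionMessage(e))
            if(!is.null(msg)){
                failures[[col]] = msg
                break
            }
        }
    }
    failures
}

# Toy stand-in for the tabulation step: fails when value labels are duplicated.
check_labels = function(x){
    labs = attr(x, "labels")
    if(!is.null(labs) && anyDuplicated(labs) > 0) stop("duplicated values in labels")
    invisible(TRUE)
}

toy = data.frame(a = 1:3, b = 1:3)
attr(toy$b, "labels") = c(Yes = 1, No = 1)   # deliberately duplicated value

failed = find_failing_columns(toy, names(toy), check_labels)
print(failed)
```

In practice `f` would wrap the actual tabulation of a single variable, and `attempts` would be raised until the intermittent failure has had a fair chance to show up.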

@tgravelle
Author

Thank you for your reply. I'm reading in an SPSS .sav dataset exported from IncQuery using haven::read_sav(). I've previously done the same with .sav files from other sources without issue. Your function does not identify any duplicated factor levels in my data (it returns a list of length 0).
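
The error text refers to value labels rather than factor levels, so a parallel check on the `labels` attribute may also be worth a try. This is only a sketch, assuming some columns still carry a `labels` attribute (which `haven::read_sav()` attaches to labelled variables); `dupl_labels()` is a hypothetical name that mirrors `dupl_levels()` from the earlier reply, and the toy data is made up.

```r
# Hypothetical counterpart to dupl_levels(): report columns whose value
# labels (the 'labels' attribute) assign more than one label to one value.
dupl_labels = function(df){
    res = lapply(df, attr, 'labels')
    has_dupl = vapply(
        res,
        function(labs) !is.null(labs) && anyDuplicated(labs) > 0,
        logical(1)
    )
    res[has_dupl]
}

toy = data.frame(q1 = c(1, 2, 1), q2 = c(1, 2, 2))
attr(toy$q1, 'labels') = c(Yes = 1, No = 2)
attr(toy$q2, 'labels') = c(Yes = 1, No = 2, Nope = 2)   # value 2 labelled twice

found = dupl_labels(toy)
print(names(found))
```

An empty result here would suggest that the duplicated labels are not present in the data as loaded, and are instead produced somewhere during tabulation.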

I also have no way of reliably determining which column is throwing an error because the error occurs intermittently: the same code will sometimes run and sometimes not run -- without making any changes to the code.

@gdemin
Owner

gdemin commented Feb 18, 2023

Could you provide the output of str() for each variable used in your table code?
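
A minimal sketch of how that output could be collected, assuming a small made-up data frame `survey` in place of data.2; the column names and values are illustrative only.

```r
# `survey` stands in for data.2; columns and values are made up.
survey = data.frame(
    gender = factor(c("F", "M", "F")),
    Q24.0  = factor(c("Yes", "No", "Yes")),
    weight = c(1.2, 0.8, 1.0)
)

# Variables used in the table code
vars = c("gender", "Q24.0")

# str() prints the class, levels and attributes of each selected column
str(survey[vars])

# Capture the same output as text so it can be pasted into the issue
out = capture.output(str(survey[vars]))
```

Any non-standard attributes on a column (for example leftover value labels) would show up in this output.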

@gdemin
Owner

gdemin commented Mar 27, 2023

The same as #107.
Could you run sessionInfo() on your system and post the results here?

@gdemin
Owner

gdemin commented Jun 18, 2023

The same as #107. Closed as a duplicate.

@gdemin gdemin closed this as completed Jun 18, 2023