Update 02-wrangle.Rmd

Added a new section to explain how to format a user's already wrangled tibble.
yhoriuchi · Aug 15, 2023 · 63b2988
1 parent 3b452ef
commit 63b2988
Showing 1 changed file with 53 additions and 16 deletions.
diff --git a/vignettes/02-wrangle.Rmd b/vignettes/02-wrangle.Rmd
@@ -19,7 +19,6 @@ library(projoint)
 
 ### 2.2 Read and wrangle data
 
-
 #### With the flipped repeated tasks
 
 Let's look at a simple example. We expand all those arguments below for clarity:
@@ -80,21 +79,21 @@ The `.fill` argument is logical: TRUE if you want to use information about wheth
 You can see the difference by comparing the following two:
 ```{r}
 fill_FALSE <- reshape_projoint(.dataframe = exampleData1, 
-                 .idvar = "ResponseId", 
-                 .outcomes = outcomes1,
-                 .outcomes_ids = c("A", "B"),
-                 .alphabet = "K", 
-                 .repeated = TRUE,
-                 .flipped = TRUE, 
-                 .fill = FALSE)
+                               .idvar = "ResponseId", 
+                               .outcomes = outcomes1,
+                               .outcomes_ids = c("A", "B"),
+                               .alphabet = "K", 
+                               .repeated = TRUE,
+                               .flipped = TRUE, 
+                               .fill = FALSE)
 fill_TRUE <- reshape_projoint(.dataframe = exampleData1, 
-                 .idvar = "ResponseId", 
-                 .outcomes = outcomes1,
-                 .outcomes_ids = c("A", "B"),
-                 .alphabet = "K", 
-                 .repeated = TRUE,
-                 .flipped = TRUE, 
-                 .fill = TRUE)
+                              .idvar = "ResponseId", 
+                              .outcomes = outcomes1,
+                              .outcomes_ids = c("A", "B"),
+                              .alphabet = "K", 
+                              .repeated = TRUE,
+                              .flipped = TRUE, 
+                              .fill = TRUE)
 ```
 We select only the essential variables. The first data frame includes the values for the `agree` variable (whether the same profile was chosen or not) only for the repeated task. The second data frame fills the missing values for the other non-repeated tasks.
 ```{r}
@@ -105,7 +104,45 @@ fill_TRUE@data[selected_vars]
 ```
 If the number of respondents is small, if the number of specific profile pairs of your interest is small, and/or if the number of specific respondent subgroups you want to study is small, it is worth changing this option to TRUE. But please note that `.fill = TRUE` is based on an assumption that IRR is independent of information contained in conjoint tables. Although our empirical tests suggest the validity of this assumption, if you are unsure about it, it is better to use the default value (FALSE).
 
-### 2.4 Arrange the order and labels of attributes and levels
+### 2.4 Read your already-wrangled tibble
+
+You may have already read the original data downloaded from Qualtrics, loaded it into R, and wrangled it into a data frame (or tibble) ready for analysis. In such a case, you can use `make_projoint_data()` to save your data as a "projoint_data" class object, which is necessary to use `projoint()`. Here is an example. First, load your data frame.
+```{r}
+data <- exampleData1_labelled_tibble
+```
+It should look like the following. Each row should correspond to one of the two profiles in each task for each respondent. The data frame should have columns indicating (1) each respondent's ID, (2) the task number, (3) the profile number, and (4) the response (0 or 1) for each task. If your design includes the repeated task, it should also include a column recording the response for the repeated task.
+```{r}
+data
+```
+
+Next, make a character vector of your attributes. 
+```{r}
+attributes <- c("School Quality",
+                "Violent Crime Rate (Vs National Rate)",
+                "Racial Composition",
+                "Housing Cost",
+                "Presidential Vote (2020)",
+                "Total Daily Driving Time for Commuting and Errands",
+                "Type of Place")
+```
+Then, make a suitable object for the next steps using `make_projoint_data()`. The default variable names are shown below. If your data frame uses different names, you can change them. 
+```{r, message=FALSE}
+out4 <- make_projoint_data(.dataframe = data,
+                           .attribute_vars = attributes, 
+                           .id_var = "id", # the default name
+                           .task_var = "task", # the default name
+                           .profile_var = "profile", # the default name
+                           .selected_var = "selected", # the default name
+                           .selected_repeated_var = "selected_repeated", # the default is NULL
+                           .fill = TRUE)
+```
+The output will be the same as the output of `fill_TRUE` in the previous section.
+```{r}
+out4
+```
+
+
+### 2.5 Arrange the order and labels of attributes and levels
 
 The reshaped data have attributes and levels that are sorted alphabetically. Often, however, you want to reorder the attributes and/or the levels of a particular attribute. You may also prefer not to use the actual labels for attributes and levels used in your conjoint experiments; for example, for the purpose of presentation, you may want to make them shorter. This process has been challenging for applied scholars using other packages. We make this process easy. You first save the labels using `save_labels()`. In the CSV file you save on your local computer, you should revise the column named `order` to specify the order of attributes and levels you want to display in your figure. You can also revise the labels for attributes and levels in any way you like. But *you should not make any change to the first column named `level_id`*. After saving the updated CSV file, you should use `read_labels()` to read it and save the object suitable for the next step (i.e., `projoint()`).