Adding Survey Responses to Public Dashboard #124

Abby-Wheelis · 2024-03-15T01:06:29Z

Starting to add the survey responses to the public dashboard - want to allow for any survey configuration, including multiple conditional surveys, see discussion in the issue

notebook currently uses the survey_info part of the config uses config to get the surveys, and use the xlsx files to translate between data and readable labels generates a plot for every question present in the data hopefully will work with multiple surveys, but only tested with 1 so far

starting to read via xml instead of spreadsheet, more support across languages

there was a bug (duplicate code) in how I was creating the dictionaries

We only want to display question that are still in the survey, as those are the questions we will actually have a response for

added code to the index.html file to fetch surveys, add each of the options for the survey questions, and display the charts by default on the dashboard

use labels except for likert, use values

uncovered a few bugs

These will never have answers, so no need to chart them. Example was "Please rate the following statements: "

all question names begin with a capital letter, so we can safely drop all columns that begin lowercase had to change from a hardcoded list because different survey data had different extra columns, and we need this to be as general as possible to prevent extra maintanence in the future

now properly showing the debug df and the alt text on all charts - including the ones that don't have enough data

before, I was dropping all columns that were not survey questions no longer needed because we are generating charts directly from the list of questions (not the data) - this also matches better with the frontend!

Filter off the "input" type questions -- such as "other, please specify" so that we could choose the best way to display them later

In order to properly generate the quality text, we need to know how many trips a particular survey was presented for To accomplish this, we can use the strings in the config that determine what survey to show the user (or a python version thereof) We need to use the composite trips to have the sections, and perform some other data manipulations in order to have this work properly Next is finding all of the responses that are actually present for each survey ... and associating that with each question ...

Each survey will have a different "denominator" of quality text which should be presented to the user to represent what percentage of total eligible trips have survey responses using "showsIfPy" as a backup for strings that are too complicated, changing access notation for dictionaries in e-mission/nrel-openpath-deploy-configs#88, and replacing && and ! allows us to evaluate the strings in python needed to track the survey names in order to know which denominator to use, so turned the list of urls into a dictionary instead of a list, and turned the dictionary of questions into a dictionary of dictionaries, one for each question

if there is only one survey, it isn't conditional, so the denominator is all composite trips

when preparing to evaluate the string, I had pulled individual values as a workaround using row.to_dict() instead is more generalizable

editing this filtering process to explicitly pass the eval string rather than rely on the fact the function shares scope with where it is called - for clarity

if there is no data - we catch and create empty dataframes

removed empty cell and changed "None" inputs to dummy values

removing these from the html in the same way I did in the notebook -- maybe we would add these back and a different chart type later

polishing the way quality text is handled

Even though there are no labels, we can still display charts with sensed mode and other generic metrics

emulating the comment for the study case

similar to the mode-specific notebooks, we ONLY want to run this notebook if the trip labels are surveys (ENKETO)

Abby-Wheelis · 2024-05-07T14:12:10Z

Some of the charts are turning out a little squished - the only change I've made is to increase their bounding boxes to 10x4, which seems to be what the other stacked bar charts get, but I'll investigate that further

make stacked bar plots larger, add the surveys to the list of stacked metrics so they can have "more info" buttons

Abby-Wheelis · 2024-05-07T17:46:08Z

Starting to test ... most of the baseline charts are failing to generate ... going back to the notebooks to try and reproduce ...

Also failing in the notebooks ... in generic_metrics the only charts that generated were number of trips by day and number of trips by weekday - the rest either have an error displayed or a debug df table, but there is data, shouldn't the charts be showing up with just sensed?

Actually, to answer my own question, maybe not - the sensed mode is different (bluetooth), so we may need a workaround to introduce it ... but we do have the data (see below) - we'll want to update this to be the matched mode, but maybe that can be a next step

So maybe it's the "labels"? We have user input but not traditional labels ...

expanded_ct.groupby("Mode_confirm").agg({distance_col: 'count'} throws aKeyError, which is why we show the debug dataframe ... I bet the single bar could be plotted. Yep, commented out the line to plot the labeled data and the sensed data plots just fine.

So, there is an issue with all the charts that expect labeled data not being able to plot the sensed data because of an error with the labeled data. We have a couple choices here:

remove the charts from display for survey configs (for now) until we debug & add the bluetooth sensed modes
debug now so that the plain sensed modes show
debug & add the bluetooth modes ... can we just replace the primary_mode with the primary sensed mode?

If I can do it quickly, I think just debugging is the best plan, given that the bluetooth sensed mode is also buggy (why we took it out of the quality text for now) - else just cut all broken charts for the moment being. @shankari what do you think?

shankari · 2024-05-07T17:57:29Z

@Abby-Wheelis it might be more meaningful to create a separate notebook for the fleet case. We won't have mode_confirm, and also the sensed mode for the BLE is stored separate (ble_sensed_mode instead of sensed_mode). So both the bars in the standard notebooks would be different. We can just change the index.html to show different charts if it in "fleet mode" We should revisit this once we start incorporating bluetooth into regular OpenPATH, but that should be good enough for an alpha version.

To plot the `ble_sensed_mode,

can we just replace the primary_mode with the primary sensed mode?

Yes, as in, you would need to compute the primary_ble_sensed_mode like we compute the primary mode now, but then you can just use it in the same way.

We may want to also group the bars differently for the fleet case - since we won't have labeled versus confirmed, we may want to do trip count versus distance. So basically only one figure but with two bars.

removed stale lines that were breaking notebook syntax!

Abby-Wheelis · 2024-05-07T18:10:46Z

it might be more meaningful to create a separate notebook for the fleet case. We won't have mode_confirm, and also the sensed mode for the BLE is stored separate (ble_sensed_mode instead of sensed_mode). So both the bars in the standard notebooks would be different. We can just change the index.html to show different charts if it in "fleet mode"

I think I could handle the dfc-fermata changes this way, but it would still be broken for all other survey cases (likewashingtoncommons). Maybe a metrics notebook for survey cases - which handled both fleet and non-fleet cases (difference would be displaying sensed mode or ble sensed mode) would be best?

We may want to also group the bars differently for the fleet case - since we won't have labeled versus confirmed, we may want to do trip count versus distance. So basically only one figure.

I think this would work, we might face duplicating or losing some of the charts, though:

number of trips + distance of trips + distance (land [omit in fleet case?]) - 1 chart
loose trips under 80%

That's fairly minimal loss and fairly minimal duplication, that feels OK to me, will proceed with this

shankari · 2024-05-07T18:13:38Z

loose trips under 80%

We can retain trips under 80% and land trips as two more figures with count and distance. I am not sure they are as meaningful for the fleet case, but might be for the other survey cases (like washington commons)

Abby-Wheelis · 2024-05-07T20:01:15Z

Three charts for the sensed modes: count/distance, under 80%, and land modes:

We may want to refine the text on these axis/titles in particular - going to pipe through to the frontend first

create a dedicated notebook for handling metrics on survey deployments currently option to show the "bluetooth sensed" mode is commented out -- this just shows a lot of unkowns, would this one day be the exact vehicles and how much they are used? or just car/ecar?

naming the charts from the specific notebook so they don't get overwritten by the other nbs with a _survey suffix

errors encountered while testing

Abby-Wheelis · 2024-05-07T23:27:19Z

Testing done with dfc-fermata (May 6th) dataset - can run everything from the command line with no breaking!

When dashboard loads:

Non-survey metrics:

More survey examples:

More info examples:

Surveys for May are showing the debug df since there are no responses

Testing with washingtoncommons as well, getting 1 weird behavior:

The last segment is failing to plot for some reason ....

Abby-Wheelis · 2024-05-07T23:33:03Z

Ah, it was the Other color not present in the map for survey responses - patch incoming

Abby-Wheelis · 2024-05-07T23:49:26Z

fixed!

shankari

I am going to fix a couple of the issues and then merge. Let's fix the other issues in a cleanup PR.

shankari · 2024-05-08T05:55:29Z

viz_scripts/scaffolding.py

@@ -69,6 +69,16 @@ def load_all_participant_trips(program, tq, load_test_users):
    disp.display(participant_ct_df.head())
    return participant_ct_df

+def filter_composite_trips(all_comp_trips, program, load_test_users):


Now that you have removed load_composite_trips, you can also remove filter_composite_trips

I see that filter_composite_trips is not identical to filter_labeled_trips because it doesn't filter by blank user input. However, I don't understand why you are not filtering by blank user input. It should be possible to just call load_all_participant_trips instead of load_all_confirmed_trips followed by filter_composite_trips to achieve the same result.

However, I don't understand why you are not filtering by blank user input.

I do, later, but I need all of the trips in order to create the "all trips for which a survey was prompted" information

I made some changes to the data loading (including pushing it to scaffolding) in #135 - I have the filtered, unfiltered, and file suffix all returned from a function in scaffolding now - I don't have the debug_df or quality_text included since it varies chart to chart

viz_scripts/docker/environment36.dashboard.additions.yml

shankari · 2024-05-08T06:20:28Z

viz_scripts/survey_responses.ipynb

+    "tq = scaffolding.get_time_query(year, month)\n",
+    "all_confirmed_trips = scaffolding.load_all_confirmed_trips(tq)\n",
+    "#we need to filter out trips (based on if including test users)\n",
+    "all_confirmed_trips = scaffolding.filter_composite_trips(all_confirmed_trips, program, include_test_users)\n",
+    "\n",


I think this should all go into scaffolding (similar to load_viz_notebook_sensor_inference_data)

shankari · 2024-05-08T06:30:06Z

viz_scripts/survey_responses.ipynb

+    "#merge any cols with the same name into 1 col -- should have different values in their survey_name col\n",
+    "#https://stackoverflow.com/questions/24390645/python-pandas-merge-samed-name-columns-in-a-dataframe\n",
+    "def sjoin(x): return ';'.join(x[x.notnull()].astype(str))\n",
+    "df_responses = df_responses.groupby(level=0, axis=1).apply(lambda x: x.apply(sjoin, axis=1))"


what is this for? If the responses are in different surveys, should they be combined into a single column?

I see this was in #124 (comment) but I don't fully understand why the column is duplicated. Shouldn't the survey id be part of the column name as well?

As it stands, the survey id is not part of the column name - I could refactor to include it as a part of the column name and update the way that I translate the column names to be the chart titles

They should end up in different charts since we group and filter by the survey_name, which is its own column

viz_scripts/docker/crontab

As part of e-mission/e-mission-server#965 and e-mission/e-mission-server@d33ce2e

We do not need to manually install em-common since it was already installed in the base server image as part of e-mission/e-mission-server#965 and e-mission/e-mission-server@d33ce2e So we removed it in 334c95b But we do need to bump up the base image to include it

- use `.get` to check whether this is an enketo survey so that it works for older deployments that predate the `survey_info` functionality as well. This has no functional difference since, if there is no `survey_info`, this is not a survey. But we get a better exception and avoid confusion later. - improve error handling for survey_responses, by splitting the attribute and name errors from the more complex errors

shankari · 2024-05-08T08:38:13Z

Testing done:

metrics work
survey responses don't work, but that is because there are only TripConfirm entries in the dataset ?!

Abby-Wheelis · 2024-05-08T15:42:28Z

survey responses don't work, but that is because there are only TripConfirm entries in the dataset ?!

What dataset were you using? 336 trips is a lot less than the 990 I have in the May 8th dataset, it it's old enough I would expect only to see TripConfirm survey entries, since I believe that is what we were using for our very preliminary internal testing - then we went to Roam/Return/Gas around the end of March

Abby-Wheelis · 2024-05-08T15:51:01Z

This broke normal deployments on staging!!!

I'll work on a cleanup for this issue and the points mentioned above

Abby Wheelis added 30 commits March 13, 2024 11:57

add quality text to pie charts

a526257

revise the way surveys are read

c4cfdfa

starting to read via xml instead of spreadsheet, more support across languages

update dictionary building

427f44b

there was a bug (duplicate code) in how I was creating the dictionaries

only display questions currently in the survey

94f48c5

We only want to display question that are still in the survey, as those are the questions we will actually have a response for

connect charts to frontend

7f3d477

added code to the index.html file to fetch surveys, add each of the options for the survey questions, and display the charts by default on the dashboard

translate answers, drop zero other

74768f7

use labels except for likert, use values

updates while working with washingtoncommons

99e0cc8

uncovered a few bugs

add check to prevent showing "label" questions

eb14800

These will never have answers, so no need to chart them. Example was "Please rate the following statements: "

clean up some print statements

3a3533d

fix failed charts and alt text

dc3bab8

now properly showing the debug df and the alt text on all charts - including the ones that don't have enough data

no need to drop columns

ea0c660

before, I was dropping all columns that were not survey questions no longer needed because we are generating charts directly from the list of questions (not the data) - this also matches better with the frontend!

sensed instead of labeled trips by mode

157ad15

pull out "input" type questions

e6c7fe6

Filter off the "input" type questions -- such as "other, please specify" so that we could choose the best way to display them later

bypass evals for non-conditional surveys

7d8a114

if there is only one survey, it isn't conditional, so the denominator is all composite trips

flatten entire row instead of pulling values

8a2c2f4

when preparing to evaluate the string, I had pulled individual values as a workaround using row.to_dict() instead is more generalizable

add eval string as a parameter to filter fncn

a04a316

editing this filtering process to explicitly pass the eval string rather than rely on the fact the function shares scope with where it is called - for clarity

add the survey info parameter

af3dfa8

add new notebook to crontab

399c7b3

deal with missing data

75d4fb6

if there is no data - we catch and create empty dataframes

clean up notebook

45026b7

removed empty cell and changed "None" inputs to dummy values

remove old code

733e559

no "input" type questions in html

8a4869b

removing these from the html in the same way I did in the notebook -- maybe we would add these back and a different chart type later

edits to quality text

df6fde3

polishing the way quality text is handled

add sensed metrics to survey dashboards

d6b2d80

Even though there are no labels, we can still display charts with sensed mode and other generic metrics

comments for every case of dashboard

0da6c33

emulating the comment for the study case

break survey notebook if irrelevant

8ff0b8b

similar to the mode-specific notebooks, we ONLY want to run this notebook if the trip labels are surveys (ENKETO)

update frontend for metrics

6c38e97

make stacked bar plots larger, add the surveys to the list of stacked metrics so they can have "more info" buttons

remove olde code

a57a30f

removed stale lines that were breaking notebook syntax!

Abby-Wheelis added 7 commits May 7, 2024 14:23

add survey metrics to the front end

6957500

naming the charts from the specific notebook so they don't get overwritten by the other nbs with a _survey suffix

remove testing-only code

2e2e269

use correct debug frame

f9f3ba5

restore default parameters

3d4c5fb

add survey metrics to list of stacked

40c1478

handle empty dataframes better

6a36770

errors encountered while testing

Abby-Wheelis added 2 commits May 7, 2024 17:33

handle "Other" values in color map

41db5cf

tidy code

fb7bc70

Abby-Wheelis marked this pull request as ready for review May 7, 2024 23:51

shankari requested changes May 8, 2024

View reviewed changes

shankari added 3 commits May 7, 2024 23:36

🔥 Remove git and em-common since they are in the server repo already

334c95b

As part of e-mission/e-mission-server#965 and e-mission/e-mission-server@d33ce2e

shankari merged commit 5aa3434 into e-mission:main May 8, 2024

Abby-Wheelis mentioned this pull request May 8, 2024

🔥 🧹 Bugfix and cleanup for "Adding Surveys" #135

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Adding Survey Responses to Public Dashboard #124

Adding Survey Responses to Public Dashboard #124

Abby-Wheelis commented Mar 15, 2024

Abby-Wheelis commented May 7, 2024

Abby-Wheelis commented May 7, 2024

shankari commented May 7, 2024 •

edited

Loading

Abby-Wheelis commented May 7, 2024

shankari commented May 7, 2024

Abby-Wheelis commented May 7, 2024

Abby-Wheelis commented May 7, 2024

Abby-Wheelis commented May 7, 2024

Abby-Wheelis commented May 7, 2024

shankari left a comment

shankari May 8, 2024

shankari May 8, 2024

Abby-Wheelis May 8, 2024

Abby-Wheelis May 8, 2024

shankari May 8, 2024

shankari May 8, 2024

shankari May 8, 2024

Abby-Wheelis May 8, 2024

shankari commented May 8, 2024

Abby-Wheelis commented May 8, 2024

Abby-Wheelis commented May 8, 2024

Adding Survey Responses to Public Dashboard #124

Adding Survey Responses to Public Dashboard #124

Conversation

Abby-Wheelis commented Mar 15, 2024

Abby-Wheelis commented May 7, 2024

Abby-Wheelis commented May 7, 2024

shankari commented May 7, 2024 • edited Loading

Abby-Wheelis commented May 7, 2024

shankari commented May 7, 2024

Abby-Wheelis commented May 7, 2024

Abby-Wheelis commented May 7, 2024

Abby-Wheelis commented May 7, 2024

Abby-Wheelis commented May 7, 2024

shankari left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

shankari commented May 8, 2024

Abby-Wheelis commented May 8, 2024

Abby-Wheelis commented May 8, 2024

shankari commented May 7, 2024 •

edited

Loading