-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change from pie charts to 100% stacked bar chart #123
Change from pie charts to 100% stacked bar chart #123
Conversation
iantei
commented
Feb 26, 2024
•
edited
Loading
edited
- Changes to transform single pie charts into collective 100% Stacked Bar Charts.
- Merge generic_metrics_sensed and mode_specific_metrics into generic_metrics notebook. Delete generic_metrics_sensed and mode_specific_metrics notebook, also from crontab.
… Count and Proportion.
…metrics notebook.
…data-sizex to 10 from 4 to adjust horizonatal length of the stacked bar charts. Update with new option values.
Comparing Laos-Chart comment to the above charts.
Overall, looks good. |
@shankari Dataset used:
|
UPDATE: @shankari I have lost access to the NREL system account. |
@iantei That is largely expected because there is no data for A bigger showstopper is the one that I highlighted here: #123 (comment) While I am fine with working on code structure, I was really really hoping to not have to tweak pandas horizontal bar generation and spacing. My hope was that the visualization intern(s) would make the charts look pretty and I only needed to make the code look pretty. |
I'm starting to catch up with these changes to hopefully be able to incorporate the surveys smoothly, and on this part of the showstopper mentioned above -
I think this is the plot itself and not part of the bar (this is the background for the chart area and you can see a faint white gridline) so if we want it removed for aesthetic reasons we can do that but I don't think it represents a value/needs to be in the legend |
I agree that it should not be in the legend. But I am not sure that it so easy to fix because we are computing the blocks to generate the horizontal bar. Is there a reason why we didn't just use pandas DataFrame.bar with stacked=True? Also, the second point in the showstopper (the legend for the labeled trips is too big and overlaps the sensed) will be even more true for the survey changes, correct? Do you have any thoughts on how to change that? Maybe a horizontal legend instead of vertical? |
I think that we'll put it horizontally, below the chart itself since the labels may have much longer text also, I anticipate this being the most difficult to get worked out visually when it comes to the surveys. As far as using pandas Dataframe.bar I'm not sure if there is any particular reason why we did it this way, I have a general sense that the matplotlib method is a little easier to customize, but I haven't tried the pandas method, maybe that would work. |
To be consistent with e-mission#86 (comment)
After improving error handing, testing done: set the date to one for which we have no data, including for the ones which showed a backtrace in #123 (comment) Both alt_text and alt_html files have been created for the newly created stacked bar charts, not for the older bar charts
|
In f7f3590, we added try/catch blocks for `AttributeError` and `pd.errors.UndefinedVariableError` to handle errors in pre-processing. However, apparently these were removed during subsequent changes/refactors. Since we have now moved the pre-processing to the cell in c16e4eb, this is even more important since errors are very likely outside of the plot function. In this change, we: - recreate the merged_debug_df to use in cells where we plot both labeled and unlabeled data - standardize the error handling by: - catching `KeyError` in addition to the others (consistent with observed behavior) - clearing the existing figure first so that we don't get two blank axes - plotting and generating the alt text with the same `plot_title_no_quality` - making sure to add alt_html to all of the except clauses for the new stacked bar charts This fixes e-mission#123 (comment) Testing done: e-mission#123 (comment)
After
|
In the heavy lift commit (12b00e3) there were several TODOs that I hoped would be handled by the public dashboard team. Since that wasn't possible due to internship timelines, I have handled at least one, user visible TODO by displaying the missing HTML as HTML instead of text. Testing done: e-mission#123 (comment)
The solution to have the legend items side by side is to use nCols: @Abby-Wheelis I'm thinking that we might want to keep the legend for the regular bar charts in the same place, but just increase the number of columns. For surveys, we may want to put the legend below the bar (to accommodate longer options) but still have only one column. We could specify that as an option to the plot function. Thoughts? |
A temporary fix for now is to pin the bbox to the bottom and not the top. I think that the real fix is to use ncols, because then it will work for both sensed and labeled legends. What if we have more than 5 sensed values? But we will need to revisit this anyway when we have inferred modes, and when we display surveys, and when we use pandas barh, so punting on this and implementing the temporary fix for now Testing done: e-mission#123 (comment)
We can't use pandas' barh because stacked doesn't see to work easily
In lieu of investigating this further, let's just try to fix the gray value. This is displayed to avoid I tried to set the xlim to 100, because we know that it will never be more than that. However, 100, 100.01 and 100.1 and 101 all returned the same error. Leaving this for the pandas investigation... |
One final fix, because the quality text annoys me and is confusing IMHO. Saying "For Labeled & Sensed: Based on X confirmed trips from Y users of X' trips from Y' users" is technically correct but requires cogitive load to determine which set of numbers goes with which bar, and what is labeled and what is confirmed. Going to try to fix that before declaring that this is done. |
The quality text is updated in the inferred changes! That puts it on each bar, where it says something like "Confirmed trips, 100 trips from 10 users 15%" or similar like this: comment - I agree that this quality text is confusing, I like what ended up on the inferred trips more, I think we had decided to hold off on that change in this PR to limit the changes here - but that was a while ago when the timeline wasn't as crunched |
…d participants ``` def get_quality_text_sensed(df, cutoff_text="", include_test_users=False): ``` now has a `cutoff_text` argument. However, the invocation from scaffolding was still ``` quality_text = get_quality_text_sensed(expanded_ct, include_test_users) ``` So the boolean `True` was interpreted as the cutoff_text and so the text that was generated was "Based on 2728 trips (True) from 13 users". After fixing that, this option seems to work. Testing done: e-mission#123 (comment)
Made the appropriate changes to all the plots, although I kept some single-bar plots unchanged Note that I had to remove the y-axis label "Trip Types" because: |
This makes is more clear what parts of the quality text correspond to which bar. - I left the commute plot untouched because there is only one bar, and recreating the quality text would be a bit complicated. - I also left the non stacked bar chart plots untouched Testing done: e-mission#123 (comment) e-mission#123 (comment)
This was fairly straightforward. Since this now introduces pre-processing into the cell, we also need to copy over the proper error handling. I have chosen to keep the quality text untouched here, since all of these are single-bar plots. Testing done: - Ran with and without data, notebook ran with no errors
I am going to declare that this PR is done, at least to the extent that I am able to spend time on it at this time. |
#128 (comment) does look cool! We should revisit the quality text construction code when we implement inferred modes. I am not super happy with that code. |
I have now tested with three different configurations:
All run without errors. @iantei has already tested the core code extensively before. Since I only restructured the code, I have tested that my changes generate the same results as his. I think this is ready to merge. |
I have been debating a bit on whether to squash merge this or just merge it with a merge commit. Let's go ahead and non-squash merge (we didn't squash merge for the UI rewrite either). But we should really be more careful about the plethora of commits in the future. |