Performance Improvements in User Stats Processing #124

TeachMeTW · 2024-09-09T21:45:21Z

Description

This pull request introduces several performance improvements to the data page:

Timer Integration: Added timing to measure the performance of the data page.
Lazy Loading: Added processing of UUIDs in batches of 10.

Changes Made

1. Timer Integration

Added timing to measure the execution time of the add_user_stats function and other functions.
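As an illustration of this kind of instrumentation, here is a minimal sketch of a timing decorator. The `timed` decorator and the stand-in body of `add_user_stats` are assumptions made for the example; they are not the code in this PR.

```python
import functools
import logging
import time


def timed(func):
    """Log how long each call to func takes and return its result unchanged."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs even if func raises, so failed calls are timed too.
            elapsed = time.perf_counter() - start
            logging.debug("%s took %.3f s", func.__name__, elapsed)
    return wrapper


@timed
def add_user_stats(user_data):
    # Hypothetical stand-in; the real function lives in utils/db_utils.py.
    return [dict(user, processed=True) for user in user_data]
```

Because the decorator uses `functools.wraps`, the wrapped function keeps its original name in the log output.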

2. Lazy Loading Implementation

Replaced sequential user processing with batch processing. The batch size can be set either by changing the default value in add_user_stats (utils/db_utils.py) or at the call in render_content (pages/data.py).

Each batch of 10 takes about 7 seconds to process.
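The batching idea can be sketched roughly as follows. The helper names `iter_batches` and `process_in_batches`, and the `process_batch` callback, are hypothetical and only illustrate the approach; the actual implementation in this PR may differ.

```python
from itertools import islice


def iter_batches(items, batch_size=10):
    """Yield successive lists of at most batch_size items."""
    it = iter(items)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def process_in_batches(uuids, process_batch, batch_size=10):
    """Yield the cumulative results after each batch is processed.

    The caller can redraw the table after every yield instead of
    waiting for all users to finish.
    """
    results = []
    for batch in iter_batches(uuids, batch_size):
        results.extend(process_batch(batch))
        yield list(results)  # copy, so earlier snapshots are not mutated
```

Yielding a snapshot after each batch is what lets the page show partial results early, at the cost of holding all processed rows in memory.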

TeachMeTW · 2024-09-10T21:18:08Z

This PR will need a squash merge because of my many commits.

shankari · 2024-09-12T18:54:04Z

@JGreenlee @TeachMeTW I looked at this a while ago and came up with a plan (or two) but didn't have time to implement it. The key here is to really dig into the user model and understand that the user stats consist of two separate parts:

the user profile: which is stored as a single table
the user stats: which are computed from the trip table (e.g. % labeled, last trip, first trip, etc)

Reading the first is super fast, because it is a single small table.
Reading the second is very slow, because we essentially have to query the entire trip table.

We should fix this by:

lazy loading the user stats part: I looked up plotly and it is possible to update data as new fields are computed (e.g. something like https://community.plotly.com/t/extend-or-append-data-instead-of-update/8898/4). I might have filed an issue for this as well but can't find it now.
pre-computing the user stats and also storing them in the user profile: ideally we would do this as part of the pipeline OR a separate script that runs once a day. Note that we will also need a backwards compat script if we choose this.
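A rough sketch of the pre-computation option, under stated assumptions: each trip is taken to be a dict with a numeric `start_ts` and a `user_input` dict that is empty or missing when the trip is unlabeled. These field names, and the helpers `compute_user_stats` and `with_cached_stats`, are hypothetical and not taken from the actual data model.

```python
def compute_user_stats(trips):
    """Summarise one user's trips into the fields shown on the data page.

    Assumes each trip is a dict with "start_ts" (number) and an optional
    "user_input" dict; both field names are hypothetical.
    """
    if not trips:
        return {"total_trips": 0, "labeled_trips": 0, "pct_labeled": 0.0,
                "first_trip": None, "last_trip": None}
    labeled = sum(1 for trip in trips if trip.get("user_input"))
    start_times = [trip["start_ts"] for trip in trips]
    return {
        "total_trips": len(trips),
        "labeled_trips": labeled,
        "pct_labeled": round(100.0 * labeled / len(trips), 1),
        "first_trip": min(start_times),
        "last_trip": max(start_times),
    }


def with_cached_stats(profile, trips):
    """Return a copy of the profile with the summary stored alongside it."""
    return {**profile, **compute_user_stats(trips)}
```

If a daily script stored this summary in the profile, the data page could read it from the small profile table and skip the trip table query entirely.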

shankari · 2024-09-12T18:58:15Z

Ah I think it was Patch that I originally found for lazily updating data
https://plotly.com/blog/partial-properties/

TeachMeTW · 2024-09-12T19:38:24Z

@shankari I made progress on the lazy loading. The only thing I still need to figure out is how to stop it from refreshing the UI. I tried to use Patch(), but it does not seem to work fully as intended yet; I'll keep at it.

See below:

8mb.video-WNJ-MZ7G6LG7.mp4

TeachMeTW · 2024-09-13T20:48:19Z

@shankari Progress: I discussed this with @JGreenlee, who helped resolve some of the issues I was facing. The data now loads without reloading the entire page.

8mb.video-P0Y-tuqh2oNa.mp4

shankari

This seems to work, and I want to get it out to staging ASAP so that we can test and move to production (and unblock everybody). We can always return and polish later.

But I do have some questions to help me understand your changes better.

app_sidebar_collapsible.py

requirements.txt

shankari · 2024-09-18T02:30:15Z

pages/data.py

+            # Create a Patch object to append data progressively
+            patched_data = Patch()
+            patched_data['data'] = processed_data
+


I don't understand how this is used. I see that the patched_data object is created here, but I don't see it used anywhere else in this PR. I even see that line 129 references this Patch object in a comment, but I don't see any of the Patch object methods, such as append.

Are we actually using patch? If not, what are we doing for lazy loading?

TeachMeTW · 2024-09-18

I believe this was a relic of a prior iteration; I can remove or clean up this section in another PR.

utils/db_utils.py

shankari · 2024-09-18T02:41:01Z

Squash merging this as well. Make sure to pull the changes before starting on the next PR, @TeachMeTW.

TeachMeTW added 9 commits September 6, 2024 16:57

Polars of 2 DB Utils Function

a7c7014

Cut Load Time in half and added timers

e0e2da6

Delete docker-compose-dev.yml.bak

406e3bf

Added Timers

efb9436

commit fix

3474dbd

commit fix

c6c5235

update gitignore

ee62b79

update

51d18ff

update

8c95c36

TeachMeTW added 3 commits September 11, 2024 16:36

batch loading

dc93c35

Batch Loading

cd84641

Removed Polars, Fixed Trajectory bug

0e8d15c

TeachMeTW added 3 commits September 12, 2024 12:44

Batch

8617d6d

Loads but refreshes

26085a1

Worksgit add .

8c8faa2

TeachMeTW added 4 commits September 16, 2024 21:33

Reverted Changes for new pr

c1a30e5

Fix

95f2c66

Fix

d1715e5

Req being weird

f73cd4d

TeachMeTW marked this pull request as ready for review September 17, 2024 04:40

Revert last line change

c6fb3d0

shankari mentioned this pull request Sep 18, 2024

Performance Improvements in Home Page Processing #126

Closed

shankari changed the base branch from master to upgrade_dependencies September 18, 2024 02:34

shankari changed the base branch from upgrade_dependencies to master September 18, 2024 02:34

shankari approved these changes Sep 18, 2024

shankari merged commit b9b0c34 into e-mission:master Sep 18, 2024

This was referenced Sep 19, 2024

Refactor and Improvements to #124 #127

Merged

Implement dynamic loading rather than static intervals on data page #128

Open

shankari pushed a commit that referenced this pull request Sep 22, 2024

Refactor and Improvements to #124 (#127)

b6e20ae

* Removed artifacts * Removed artifacts

shankari mentioned this pull request Oct 20, 2024

Collect data on performance improvements to the dashboard #145

Open

TeachMeTW added a commit to TeachMeTW/op-admin-dashboard that referenced this pull request Oct 21, 2024

Reverted e-mission#124 and Added Logging

71f27ed

shankari added a commit that referenced this pull request Oct 21, 2024

Merge pull request #146 from TeachMeTW/Revert_#124

5bb7b8e

Reverted #124 and Added Logging

shankari added a commit that referenced this pull request Oct 21, 2024

Revert "Reverted #124 and Added Logging"

ed3177c

shankari added a commit that referenced this pull request Oct 21, 2024

Merge pull request #148 from e-mission/revert-146-Revert_#124

ff6adc3

Revert "Reverted #124 and Added Logging"
