Performance Improvements in Home Page Processing #126

Closed

Conversation

TeachMeTW
Contributor

@TeachMeTW TeachMeTW commented Sep 17, 2024

This PR enhances dashboard performance by optimizing UUID processing and refining database queries. These changes aim to reduce load times and improve the overall efficiency of the application.

Key Changes

  1. Removed Unnecessary UUID Conversion

    • Bypassed the conversion of UUID strings to UUID objects when not required, streamlining data processing.
  2. Optimized MongoDB Aggregation Pipeline

    • Combined multiple $match stages into a single stage to reduce pipeline overhead and accelerate query execution.
  3. Enhanced Logging (Temporary)

    • Implemented a log_execution_time decorator to monitor and log the execution time of key functions, aiding in performance tracking and debugging.
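For reference, combining consecutive $match stages (item 2) can be sketched as below. This is a minimal illustration on plain Python dicts, not the code from this PR: the helper name `merge_match_stages`, the field names, and the sample pipeline are all hypothetical. Note also that MongoDB's own pipeline optimizer documents coalescing adjacent $match stages, so the gain from doing this by hand may be small there; whether DocumentDB behaves the same way is not established in this thread.

```python
def merge_match_stages(pipeline):
    """Collapse each run of consecutive $match stages into one $match using $and.

    Hypothetical helper for illustration; it builds a new list and does not
    mutate the stages that are passed in.
    """
    merged = []
    for stage in pipeline:
        if "$match" in stage and merged and "$match" in merged[-1]:
            prev = merged[-1]["$match"]
            # If the previous stage is already a pure $and, extend its conditions;
            # otherwise wrap the previous filter as the first condition.
            conditions = prev["$and"] if list(prev) == ["$and"] else [prev]
            merged[-1] = {"$match": {"$and": conditions + [stage["$match"]]}}
        else:
            merged.append(stage)
    return merged


# Hypothetical pipeline: field names and values are assumptions, not from the PR.
pipeline = [
    {"$match": {"user_id": {"$in": ["u1", "u2"]}}},
    {"$match": {"metadata.key": "analysis/confirmed_trip"}},
    {"$group": {"_id": "$user_id", "n": {"$sum": 1}}},
]

merged = merge_match_stages(pipeline)
print(merged)
```

Using $and keeps the merged filter equivalent to the original sequence even when two stages constrain the same field, which a naive dict merge would silently overwrite.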

Benefits

  • Improved Performance: Reduced processing time by eliminating unnecessary operations and optimizing database queries.
  • Better Monitoring: Enhanced logging provides clear insights into function execution times, facilitating easier troubleshooting and performance tuning.

Testing

  • Functionality: Verified that all dashboard components display correctly and data is accurately processed.
  • Performance: Measured execution times before and after optimizations to confirm performance gains.
  • Logging: Ensured that execution times are correctly logged for all decorated functions.
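For readers unfamiliar with the pattern, a timing decorator of the kind described above might look like the following. Only the name `log_execution_time` comes from this PR description; the body, the log format, and the decorated function `count_users` are assumptions made for illustration.

```python
import functools
import logging
import time

logger = logging.getLogger(__name__)


def log_execution_time(func):
    """Log how long each call to `func` takes (sketch; log format is an assumption)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs even if func raises, so failed calls are timed as well.
            elapsed = time.perf_counter() - start
            logger.info("%s took %.4f s", func.__name__, elapsed)
    return wrapper


@log_execution_time
def count_users(uuid_list):
    # Hypothetical stand-in for a dashboard function worth timing.
    return len(uuid_list)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    count_users(["a", "b", "c"])
```

`time.perf_counter()` is used rather than `time.time()` because it is monotonic and intended for measuring short durations.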

@TeachMeTW
Contributor Author

@shankari @JGreenlee I've already notified Jack about this, but right now on prod/staging the initial overview load times are unbelievably slow, whereas on my local machine with the open-access dataset the load time is only 4 seconds in the worst case. This PR won't really solve anything, considering it's not reflective of the actual issue. What are your thoughts on next steps?

@shankari
Contributor

shankari commented Sep 18, 2024

@TeachMeTW this is why we need to test on staging 😄 We have found earlier that DocumentDB access is ~ 10x slower than mongo on the local laptop, even from an AWS EC2 instance. I am about to merge #124 now, let's see what the logging shows us around optimization before focusing on this change.

@TeachMeTW
Contributor Author

Note: the logger is temporary until the emission server func pipeline is completed.

@TeachMeTW TeachMeTW marked this pull request as ready for review October 9, 2024 03:21
@TeachMeTW
Contributor Author

TeachMeTW commented Oct 9, 2024

@shankari This has lowered home page loading from 7 seconds to 0.5 seconds locally; awaiting staging to see if my fix actually works. The loggers are temporary, as stated above, and can be removed in a future commit or PR. Added to the project board as well.


# Do we really need this?
# Looks like this takes the most time
# uuid_list = [UUID(npu) for npu in uuid_list]
Contributor

It's odd that this would be the bottleneck.
Are the entries of uuid_list already instances of UUID? Or are they strings?

Contributor Author

@JGreenlee the entries of uuid_list are already strings; I don't see why we need to convert them to UUIDs when we are just using them in a query.
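One thing worth checking before dropping the conversion: a Python `UUID` never compares equal to its string form, so whether string values match depends on how the field is stored. The sketch below uses an in-memory stand-in for an equality-based `$in` filter, not a real database query; the `user_id` field name, the sample UUID, and the helper `matching_docs` are assumptions for illustration.

```python
from uuid import UUID

# Sample value chosen for illustration only.
uuid_strings = ["12345678-1234-5678-1234-567812345678"]
uuid_objects = [UUID(s) for s in uuid_strings]


def matching_docs(docs, uuid_list):
    """Mimic an equality-based `$in` filter on a hypothetical `user_id` field."""
    return [d for d in docs if d["user_id"] in uuid_list]


# Two hypothetical storage layouts for the same user.
docs_with_uuid = [{"user_id": uuid_objects[0]}]  # field stored as a UUID
docs_with_str = [{"user_id": uuid_strings[0]}]   # field stored as a string

print(len(matching_docs(docs_with_uuid, uuid_strings)))  # strings vs stored UUIDs
print(len(matching_docs(docs_with_uuid, uuid_objects)))  # UUIDs vs stored UUIDs
print(len(matching_docs(docs_with_str, uuid_strings)))   # strings vs stored strings
```

If the stored values are UUIDs, the string list matches nothing here, so a faster query could simply be one that returns fewer results. Comparing result counts before and after the change would rule that out.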

@shankari
Contributor

shankari commented Oct 9, 2024

@TeachMeTW do we have the logs deployed to see the "before" performance? If not, I would suggest that we first add logging, so we can get some "before" data, and then deploy the fix for the "after" data

@TeachMeTW
Contributor Author

TeachMeTW commented Oct 9, 2024

@shankari see #141. Would you like this implemented on other pages temporarily as well?

@shankari
Contributor

I would like to fix e-mission/e-mission-server#986 properly
Then add instrumentation on all pages and access ~5 programs (in addition to staging) to get a representative set of data
Then implement the fixes.

wrt

@shankari @JGreenlee I've already notified Jack about this, but right now on prod/staging the initial overview load times are unbelievably slow, whereas on my local machine with the open-access dataset the load time is only 4 seconds in the worst case. This PR won't really solve anything, considering it's not reflective of the actual issue. What are your thoughts on next steps?

We should make it easier to have the local machine be reflective of the real issues so we can focus on the right areas to optimize and not waste our time optimizing areas that are not the real bottlenecks.

To do this, I would suggest:

Please get the instrumentation done first, so that we can collect the before data and then use that to motivate these additional fixes.

Labels
None yet
Projects
Status: Tasks completed

3 participants