fix: extremely slow paginated conversation list query [WPB-11808] #3098
Datadog Report
Branch report: ✅ 0 Failed, 3151 Passed, 107 Skipped, 35.41s Total Time
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## release/candidate #3098 +/- ##
=====================================================
+ Coverage 52.61% 52.64% +0.02%
=====================================================
Files 1320 1321 +1
Lines 51523 51615 +92
Branches 4779 4781 +2
=====================================================
+ Hits 27111 27173 +62
- Misses 22451 22486 +35
+ Partials 1961 1956 -5
... and 9 files with indirect coverage changes
Quality Gate passed
…3102) * fix: extremely slow paginated conversation list query [WPB-11808] (#3098) * fix: extremely slow paginated conversation list query [WPB-11808] * fix name * handle case of moving messages to other conversation * add tests * trigger build --------- Co-authored-by: Michał Saleniuk <[email protected]> Co-authored-by: Michał Saleniuk <[email protected]>
PR Submission Checklist for internal contributors
The PR Title
SQPIT-764
The PR Description
What's new in this PR?
Issues
The paginated conversation list query cannot return the list in a reasonable time. For a user with many conversations and messages, it can take multiple minutes.
Causes (Optional)
The query left joins messages, with multiple columns from different content tables, and left joins unread events, all just to get the last message for each conversation and the counts of the different unread event types. This is far from optimal. We have managed to reduce full table scans, but the query still has a lot of data to process.
Solutions
To have realistic data to work with, I filled a database with ~1150 conversations and over 200 000 messages; the biggest conversation had around 35 000 messages and 27 000 unread events.
With this DB, I waited more than 10 minutes for the current paginated query to return a result and eventually gave up. So I tried various ideas; below are the ones that looked most promising.
For the unread events table, there are two options: left join all unread events into the main query and count them there, or use a subquery or view that first counts the events on just that table and then left join the results of that calculation into the main query.
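The second option for unread events can be sketched as follows. This is a minimal illustration using Python's built-in `sqlite3` module, not the code from this PR: the table layout, the column names and the event types (`MESSAGE`, `MENTION`, `KNOCK`) are assumptions, and only the view name `UnreadEventCountsGrouped` comes from the description.

```python
import sqlite3


def build_db() -> sqlite3.Connection:
    """Create a simplified, hypothetical schema in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Conversation (id TEXT PRIMARY KEY, name TEXT);
        CREATE TABLE UnreadEvent (
            id TEXT NOT NULL,
            conversation_id TEXT NOT NULL,
            type TEXT NOT NULL,
            PRIMARY KEY (id, conversation_id)
        );
        -- Count events on the UnreadEvent table alone, one row per conversation.
        -- In SQLite a comparison evaluates to 0 or 1, so SUM counts the matches.
        CREATE VIEW UnreadEventCountsGrouped AS
        SELECT conversation_id,
               SUM(type = 'MESSAGE') AS message_count,
               SUM(type = 'MENTION') AS mention_count,
               SUM(type = 'KNOCK') AS knock_count
        FROM UnreadEvent
        GROUP BY conversation_id;
    """)
    return conn


def unread_counts(conn: sqlite3.Connection) -> list:
    """Left join the pre-aggregated counts; the outer query needs no GROUP BY."""
    return conn.execute("""
        SELECT c.id,
               IFNULL(u.message_count, 0),
               IFNULL(u.mention_count, 0),
               IFNULL(u.knock_count, 0)
        FROM Conversation AS c
        LEFT JOIN UnreadEventCountsGrouped AS u ON u.conversation_id = c.id
        ORDER BY c.id
    """).fetchall()
```

Because the aggregation happens inside the view, the main query joins at most one row per conversation instead of every unread event.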
For the message table, the first idea was to match the solution implemented some time ago: use a single subquery to get the ids of the last message for each conversation and then left join it with all the contents and other data. That way the query doesn't left join all messages and their contents first, only the messages it actually needs.
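A rough sketch of this first idea, again with Python's `sqlite3` and a hypothetical, simplified schema (the table names, columns, and the `VISIBLE`/`TEXT`/`ASSET` values are assumptions for illustration). It relies on SQLite's documented behavior that bare columns in a `MAX()` aggregate query take their values from the row holding the maximum.

```python
import sqlite3


def build_db() -> sqlite3.Connection:
    """Create a simplified, hypothetical schema in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Conversation (id TEXT PRIMARY KEY);
        CREATE TABLE Message (
            id TEXT NOT NULL,
            conversation_id TEXT NOT NULL,
            creation_date INTEGER NOT NULL,
            visibility TEXT NOT NULL,
            content_type TEXT NOT NULL,
            PRIMARY KEY (id, conversation_id)
        );
        CREATE TABLE MessageTextContent (
            message_id TEXT NOT NULL,
            conversation_id TEXT NOT NULL,
            text_body TEXT,
            PRIMARY KEY (message_id, conversation_id)
        );
    """)
    return conn


def last_messages(conn: sqlite3.Connection) -> list:
    """Pick the last message id per conversation first, then join content."""
    return conn.execute("""
        SELECT c.id, m.id, t.text_body
        FROM Conversation AS c
        LEFT JOIN (
            -- SQLite-specific: `id` comes from the row with MAX(creation_date).
            SELECT conversation_id, id AS message_id, MAX(creation_date)
            FROM Message
            WHERE visibility = 'VISIBLE' AND content_type IN ('TEXT', 'ASSET')
            GROUP BY conversation_id
        ) AS last ON last.conversation_id = c.id
        LEFT JOIN Message AS m
            ON m.id = last.message_id AND m.conversation_id = c.id
        LEFT JOIN MessageTextContent AS t
            ON t.message_id = m.id AND t.conversation_id = m.conversation_id
        ORDER BY c.id
    """).fetchall()
```

Only one message per conversation reaches the content joins, which is the point of the approach.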
The second idea was to create a dedicated table that contains the id of the last message for each conversation, and nothing else, so it could replace this subquery and make the query even simpler. The main problem was keeping this table in sync with the message table, so 3 triggers were created: when inserting a new message, when updating the visibility or type of a message (the two parameters taken into account when selecting the last message), and when deleting a last message (to find the one that should be the last from now on). Another concern was whether this could affect event handling times, so I checked that by executing processApplicationMessage 3000x using the current code and with this additional table, which is kept in sync by triggers whenever a new message is inserted. The results show no difference, so it's safe to use.
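The three triggers described above could look roughly like this. It is a sketch under assumptions, not the triggers from this PR: the `Message` columns, the trigger names and the eligibility rule (`VISIBLE` messages of type `TEXT` or `ASSET`) are hypothetical; only the `LastMessage` table name comes from the description.

```python
import sqlite3


def _recompute(ref: str) -> str:
    """SQL that rebuilds the pointer for the conversation of `new` or `old`."""
    return f"""
        DELETE FROM LastMessage WHERE conversation_id = {ref}.conversation_id;
        INSERT INTO LastMessage (conversation_id, message_id)
        SELECT conversation_id, id FROM Message
        WHERE conversation_id = {ref}.conversation_id
          AND visibility = 'VISIBLE' AND content_type IN ('TEXT', 'ASSET')
        ORDER BY creation_date DESC LIMIT 1;
    """


SCHEMA = f"""
    CREATE TABLE Message (
        id TEXT NOT NULL,
        conversation_id TEXT NOT NULL,
        creation_date INTEGER NOT NULL,
        visibility TEXT NOT NULL DEFAULT 'VISIBLE',
        content_type TEXT NOT NULL DEFAULT 'TEXT',
        PRIMARY KEY (id, conversation_id)
    );
    CREATE TABLE LastMessage (
        conversation_id TEXT PRIMARY KEY,
        message_id TEXT NOT NULL
    );

    -- 1) New message: replace the pointer unless a newer last message exists.
    CREATE TRIGGER last_message_on_insert AFTER INSERT ON Message
    WHEN new.visibility = 'VISIBLE' AND new.content_type IN ('TEXT', 'ASSET')
    BEGIN
        INSERT OR REPLACE INTO LastMessage (conversation_id, message_id)
        SELECT new.conversation_id, new.id
        WHERE NOT EXISTS (
            SELECT 1 FROM LastMessage AS lm
            JOIN Message AS m
              ON m.id = lm.message_id AND m.conversation_id = lm.conversation_id
            WHERE lm.conversation_id = new.conversation_id
              AND m.creation_date > new.creation_date
        );
    END;

    -- 2) Visibility or type changed: eligibility may differ, so recompute.
    CREATE TRIGGER last_message_on_update
    AFTER UPDATE OF visibility, content_type ON Message
    BEGIN {_recompute('new')} END;

    -- 3) The current last message was deleted: find its successor.
    CREATE TRIGGER last_message_on_delete AFTER DELETE ON Message
    WHEN old.id = (SELECT message_id FROM LastMessage
                   WHERE conversation_id = old.conversation_id)
    BEGIN {_recompute('old')} END;
"""


def build_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def last_message(conn: sqlite3.Connection, conversation_id: str):
    """Return the tracked last message id for a conversation, or None."""
    row = conn.execute(
        "SELECT message_id FROM LastMessage WHERE conversation_id = ?",
        (conversation_id,),
    ).fetchone()
    return row[0] if row else None
```

The insert trigger is the hot path during event handling, so it only compares against the current pointer; the full recomputation is reserved for the rarer update and delete cases.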
For perspective, here are the times of the three main queries currently used to fetch the conversation list without pagination:
Around 175ms to get the list without pagination.
As I said, with the current paginated query I couldn't get a result in over 10 minutes, but with the improvements explained above it started to run in a very reasonable time:
A, B, C and D are combinations of these improvements for joining the message and unread event tables, explained here (with average execution times):
It looks like the winner is D. It's hard to believe that such a complex query can execute in under 20ms, but so far everything appears to work correctly.
So in this PR, the D approach is implemented: a new dedicated LastMessage table and an UnreadEventCountsGrouped view are used in the main query, which no longer needs a GROUP BY.
Testing
How to Test
Enable the paginated_conversation_list_enabled feature flag and open the conversation list.
PR Post Submission Checklist for internal contributors (Optional)