Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: extremely slow paginated conversation list query [WPB-11808] #3098

Merged

Conversation

saleniuk
Copy link
Contributor

@saleniuk saleniuk commented Nov 8, 2024

BugWPB-11808 [Android] app stalls when fetching a page of conversations from the local DB


PR Submission Checklist for internal contributors

  • The PR Title

    • conforms to the style of semantic commits messages¹ supported in Wire's Github Workflow²
    • contains a reference JIRA issue number like SQPIT-764
    • answers the question: If merged, this PR will: ... ³
  • The PR Description

    • is free of optional paragraphs and you have filled the relevant parts to the best of your ability

What's new in this PR?

Issues

The paginated conversation list query is unable to show the list in a reasonable time, if someone has many conversations and messages then it can take multiple minutes.

Causes (Optional)

Left joining messages with multiple columns from different content tables and left joining unread events to just get last message for each conversation and count of different unread events is too far from optimal. We have managed to reduce full table scans but there is still a lot of data to work on by this query.

Solutions

To have a proper data to work on, I filled a database with ~1150 conversations and over 200 000 messages, biggest conversation had around 35 000 messages and 27 000 unread events.
With this DB, I waited more than 10 minutes using the current paginated query to get the result and eventually gave up. So I checked various ideas, below are the ones that looked most promising.

For unread events table, there are two ways: to left join all unread events into main query and calculate it or to make a subquery or view to first count events on just that table and then left join results of that calculations into the main query.

For message table, first idea was to match the solution implemented some time ago and to make a single subquery to get ids of last messages for each conversation and then left join it with all contents and other things - thanks to that it doesn't left join all messages and their contents first, but just the messages that it actually needs.
Second one was to create a dedicated table which contains ids of last message for each conversation, and just that, so it could replace this subquery and make the query even simpler. The main problem with that was to make sure this table is synced with the message table, so 3 triggers were created - when inserting new message, when updating visibility or type of a message (two parameters that are taken into account when selecting last message) and when deleting a last message (to find the one that from now on should be the last one). Another issue is whether this can affect the event handling times, so I checked that by executing 3000x processApplicationMessage using current code and with this additional table which needs to stay synced using triggers when new message is inserted:

image

Results show that there is no difference, so it's safe to use.

To have a perspective, here are the times of three main queries used to fetch conversation list currently without the pagination:
image

Around 175ms to get the list without pagination.
As I said, with current paginated query I couldn't get a result in over 10 minutes, but with improvements explained above, it started to work in a very reasonable time:

image

A, B, C, D are just a combinations of these improvements for joining both message and unread event table, explained here (with average execution times):

image

It looks like the winner is D, and it's unbelievable that the execution of such complex query can take under 20ms, but so far it looks like everything is working correctly.
So here in this PR, the D approach is implemented - with new dedicated table LastMessage and UnreadEventCountsGrouped view used in the main query which doesn't require anymore to be GROUPed BY.

Testing

How to Test

Enable paginated_conversation_list_enabled feature flag and open the conversation list.


PR Post Submission Checklist for internal contributors (Optional)

  • Wire's Github Workflow has automatically linked the PR to a JIRA issue

PR Post Merge Checklist for internal contributors

  • If any soft of configuration variable was introduced by this PR, it has been added to the relevant documents and the CI jobs have been updated.

References
  1. https://sparkbox.com/foundry/semantic_commit_messages
  2. https://github.com/wireapp/.github#usage
  3. E.g. feat(conversation-list): Sort conversations by most emojis in the title #SQPIT-764.

@echoes-hq echoes-hq bot added the echoes: unplanned Any work item that isn’t part of the product or technical roadmap. label Nov 8, 2024
Copy link
Contributor

github-actions bot commented Nov 8, 2024

Test Results

3 258 tests  +6   3 151 ✅ +6   4m 52s ⏱️ +36s
  555 suites ±0     107 💤 ±0 
  555 files   ±0       0 ❌ ±0 

Results for commit 8051e1f. ± Comparison against base commit 83207c7.

♻️ This comment has been updated with latest results.

@datadog-wireapp
Copy link

datadog-wireapp bot commented Nov 8, 2024

Datadog Report

Branch report: fix/too-slow-paginated-conversation-list-query
Commit report: 20bd9fb
Test service: kalium-jvm

✅ 0 Failed, 3151 Passed, 107 Skipped, 35.41s Total Time

@codecov-commenter
Copy link

codecov-commenter commented Nov 8, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 52.64%. Comparing base (83207c7) to head (8051e1f).

Additional details and impacted files
@@                  Coverage Diff                  @@
##           release/candidate    #3098      +/-   ##
=====================================================
+ Coverage              52.61%   52.64%   +0.02%     
=====================================================
  Files                   1320     1321       +1     
  Lines                  51523    51615      +92     
  Branches                4779     4781       +2     
=====================================================
+ Hits                   27111    27173      +62     
- Misses                 22451    22486      +35     
+ Partials                1961     1956       -5     
Files with missing lines Coverage Δ
...tlin/com/wire/kalium/persistence/db/TableMapper.kt 100.00% <100.00%> (ø)
.../wire/kalium/persistence/db/UserDatabaseBuilder.kt 81.01% <100.00%> (+0.12%) ⬆️

... and 9 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 83207c7...8051e1f. Read the comment docs.

@saleniuk saleniuk requested review from a team, typfel, yamilmedina, alexandreferris, Garzas and ohassine and removed request for a team November 12, 2024 09:12
Copy link
Member

@vitorhugods vitorhugods left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GH4N8Q2XYAAja1L

Amazing work. Truly amazing work!

@saleniuk saleniuk added this pull request to the merge queue Nov 13, 2024
Merged via the queue into release/candidate with commit 2c1b2da Nov 13, 2024
20 checks passed
@saleniuk saleniuk deleted the fix/too-slow-paginated-conversation-list-query branch November 13, 2024 11:03
github-actions bot pushed a commit that referenced this pull request Nov 13, 2024
)

* fix: extremely slow paginated conversation list query [WPB-11808]

* fix name

* handle case of moving messages to other conversation

* add tests
github-merge-queue bot pushed a commit that referenced this pull request Nov 14, 2024
…3102)

* fix: extremely slow paginated conversation list query [WPB-11808] (#3098)

* fix: extremely slow paginated conversation list query [WPB-11808]

* fix name

* handle case of moving messages to other conversation

* add tests

* trigger build

---------

Co-authored-by: Michał Saleniuk <[email protected]>
Co-authored-by: Michał Saleniuk <[email protected]>
@MohamadJaara MohamadJaara added echoes: technical-roadmap Work contributing to the Technical Roadmap, to improve our velocity or reduce the technical debt. and removed echoes: unplanned Any work item that isn’t part of the product or technical roadmap. labels Nov 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
echoes: technical-roadmap Work contributing to the Technical Roadmap, to improve our velocity or reduce the technical debt. 👕 size: XL type: bug / fix 🐞
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants