Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Helper task aggregation counters: implement batched background updates #3443

Open
branlwyd opened this issue Oct 18, 2024 · 3 comments
Open

Comments

@branlwyd
Copy link
Member

branlwyd commented Oct 18, 2024

Helper task aggregation counters are currently updated in a separate DB transaction from the rest of the aggregation writes, in order to avoid unnecessary write contention. Each update is started in its own background (tokio) task, and will use its own DB transaction.

During a recent incident, we saw this transaction failing due to failure to retrieve a DB connection from the pool. In the event of overload, we could queue up an arbitrary number of these tasks, each awaiting a DB transaction -- there is no limiting factor to the number of background tasks we would spawn.

Instead, we should switch to a model where we have a single background task that receives & occasionally writes updates to the task aggregation counters, similar to how application-level metrics are handled for report uploads. This would cap the number of tasks & DB transactions used to update to 1. And this background task would also be able to batch writes to the counters, as well as handling retries in the case that the counters cannot be written.

@inahga
Copy link
Contributor

inahga commented Oct 21, 2024

similar to how application-level metrics are handled for report uploads

FWIW report upload metrics don't run in their own task and are handled as part of report uploading. (In that sense, they're more in-line than even aggregation job handling at the moment). So for this issue we may also want to align align report uploads with this model.

@branlwyd
Copy link
Member Author

Well, report upload metrics are handled in a batched fashion, in a single background task, with a controlled amount of concurrency. You are correct that the report uploads happen in that background task as well -- I'm less confident we should adopt a separate task for report upload metrics just to separate these, though it wouldn't be too hard to implement.

@inahga
Copy link
Contributor

inahga commented Oct 21, 2024

Ah, you're right, I forget that report uploads are batched themselves. Yeah then a separate background task seems less necessary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants