feat(aci): enqueue workflows for delayed processing #83548

cathteng · 2025-01-15T22:40:08Z

Adds the following logic to account for delayed processing of slow conditions:

1. Process the fast conditions in the WHEN DataConditionGroup (DCG) and note the workflows that need their slow condition(s) checked before proceeding.
2. For workflows that need their slow conditions checked, evaluate all of their IF DCGs to determine which ones would fire if the slow condition(s) pass.
3. For workflows that need their slow conditions checked and have passing IF DCGs, enqueue them in the buffer for delayed processing. We collect the IF DCGs so that we know, via DataConditionGroupAction, which actions to fire if the slow conditions pass. This way, delayed processing only has to evaluate the slow conditions.
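
The three steps in the description can be sketched roughly as follows. This is an illustrative model only, not the code in this PR: `Condition`, `Workflow`, `passes`, and `plan_delayed_processing` are hypothetical stand-ins, and the sketch assumes every group uses "all conditions must pass" logic.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Condition:
    """Hypothetical stand-in for a data condition; `slow` marks one that needs a later query."""
    name: str
    slow: bool = False


@dataclass
class Workflow:
    """Hypothetical stand-in for a workflow with one WHEN group and named IF groups."""
    id: int
    when_conditions: list[Condition]
    if_groups: dict[str, list[Condition]] = field(default_factory=dict)


def passes(conditions: list[Condition], facts: set[str]) -> bool:
    """A group passes when every *fast* condition in it is satisfied by `facts`."""
    return all(c.name in facts for c in conditions if not c.slow)


def plan_delayed_processing(workflows: list[Workflow], facts: set[str]) -> dict[int, list[str]]:
    """Return {workflow id: names of passing IF groups} for the workflows to enqueue."""
    to_enqueue: dict[int, list[str]] = {}
    for workflow in workflows:
        # Step 1: check the fast WHEN conditions and note workflows with slow ones.
        has_slow = any(c.slow for c in workflow.when_conditions)
        if not has_slow or not passes(workflow.when_conditions, facts):
            continue
        # Step 2: find the IF groups that would fire if the slow conditions pass.
        passing = [name for name, conds in workflow.if_groups.items() if passes(conds, facts)]
        # Step 3: only enqueue workflows that have at least one passing IF group.
        if passing:
            to_enqueue[workflow.id] = passing
    return to_enqueue
```

Recording the passing IF groups at enqueue time is what lets the delayed pass skip everything except the slow conditions.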

codecov · 2025-01-15T23:15:25Z

Codecov Report

Attention: Patch coverage is 98.18182% with 2 lines in your changes missing coverage. Please review.

✅ All tests successful. No failed tests found.

Files with missing lines                             Patch %   Lines
src/sentry/workflow_engine/models/workflow.py        83.33%    1 Missing ⚠️
src/sentry/workflow_engine/processors/workflow.py    96.55%    1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master   #83548      +/-   ##
==========================================
+ Coverage   87.54%   87.60%   +0.05%     
==========================================
  Files        9408     9489      +81     
  Lines      537825   539196    +1371     
  Branches    21176    21176              
==========================================
+ Hits       470859   472370    +1511     
+ Misses      66618    66478     -140     
  Partials      348      348

ceorourke

overall lgtm

saponifi3d

i think the biggest change needed here is that we should not filter the workflows while we are evaluating them. instead, we should filter before invoking evaluate. filtering inside the evaluation means we can't re-use this evaluation method in slow processing.

saponifi3d · 2025-01-17T17:16:50Z

src/sentry/workflow_engine/processors/workflow.py

+        if random.random() < 0.01:
+            logger.info(
+                "process_workflows.workflow_enqueued",
+                extra={"workflow": workflow.id, "group": event.group.id, "project": project_id},
+            )


can we audit the logging in here? I'm not sure we need a lot of these info logs anymore (this one for example is being sampled to 1% - which isn't super valuable)

@mifu67 are the logs you've downsampled to 1% for enqueue rules for delayed processing still useful?

she just did that cause they were noisy, i think there are a number of info logs here that we need to audit.

I'd recommend only keeping things that make sense to you. if you have any questions lemme know :)
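
For context on the sampled log discussed above, here is a minimal sketch of how such sampling behaves. `log_sampled` is a hypothetical helper, not something in the Sentry codebase; it wraps the same `random.random() < rate` check shown in the diff.

```python
import logging
import random
from typing import Optional

logger = logging.getLogger(__name__)


def log_sampled(
    message: str, extra: dict, sample_rate: float, rng: Optional[random.Random] = None
) -> bool:
    """Emit an info log for roughly `sample_rate` of calls; return True if it was emitted."""
    source = rng or random  # fall back to the module-level generator
    if source.random() < sample_rate:
        logger.info(message, extra=extra)
        return True
    return False
```

At a 1% rate, about 99 of every 100 enqueue events leave no trace, which is why a log sampled this way is of limited use for debugging a specific workflow.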

saponifi3d · 2025-01-17T17:19:27Z

src/sentry/workflow_engine/processors/workflow.py

+def evaluate_workflow_triggers(
+    workflows: set[Workflow], job: WorkflowJob
+) -> tuple[set[Workflow], set[Workflow]]:
    triggered_workflows: set[Workflow] = set()
+    workflows_to_enqueue: set[Workflow] = set()

    for workflow in workflows:
        if workflow.evaluate_trigger_conditions(job):
            triggered_workflows.add(workflow)
+        else:
+            if get_slow_conditions(workflow):
+                # enqueue to be evaluated later
+                workflows_to_enqueue.add(workflow)

-    return triggered_workflows
+    return triggered_workflows, workflows_to_enqueue


i don't think we should be doing filtering or anything here - this method should be a pure method to evaluate the workflow triggers and that's it; if we want to filter the workflows being evaluated we should do that before evaluating them.

let's update the code to have process_workflows figure out which conditions are fast / slow, then filter the workflows based on that.

we only enqueue workflows that need to have slow conditions evaluated because they don't pass after evaluating the fast conditions alone. are you saying to evaluate the workflows with slow conditions separately? some of them might be triggered immediately and some of them may have to be enqueued

updated to only return triggered_workflows
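
One way to read the outcome of this thread is sketched below: the evaluation function stays pure and returns only the triggered workflows, and the caller decides what to enqueue. This is a simplified illustration and not necessarily what was merged. Workflows are plain strings here, and the `evaluate` and `has_slow_conditions` callables are hypothetical parameters introduced for the example.

```python
from typing import Callable


def evaluate_workflow_triggers(
    workflows: set[str], job: dict, evaluate: Callable[[str, dict], bool]
) -> set[str]:
    """Pure evaluation: return the workflows whose trigger conditions pass."""
    return {workflow for workflow in workflows if evaluate(workflow, job)}


def process_workflows(
    workflows: set[str],
    job: dict,
    evaluate: Callable[[str, dict], bool],
    has_slow_conditions: Callable[[str], bool],
) -> tuple[set[str], set[str]]:
    """Evaluate first, then pick out the non-triggered workflows that need a delayed check."""
    triggered = evaluate_workflow_triggers(workflows, job, evaluate)
    # Only workflows that failed on fast conditions alone AND have slow
    # conditions are worth re-checking later.
    to_enqueue = {w for w in workflows - triggered if has_slow_conditions(w)}
    return triggered, to_enqueue
```

Because `evaluate_workflow_triggers` has no enqueue side effect in this version, delayed processing could call it again once the slow-condition data is available.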

cathteng · 2025-01-17T19:27:12Z

src/sentry/workflow_engine/handlers/condition/event_frequency_handlers.py

@@ -59,7 +59,7 @@ def get_result(model: TSDBModel, group_ids: list[int]) -> dict[int, int]:


 @condition_handler_registry.register(Condition.EVENT_FREQUENCY_COUNT)
-class EventFrequencyCountHandler(EventFrequencyConditionHandler, DataConditionHandler[int]):
+class EventFrequencyCountHandler(EventFrequencyConditionHandler, DataConditionHandler[WorkflowJob]):


update this to evaluate WorkflowJob so we can reuse evaluate_workflow_triggers in delayed processing, and populate snuba_results inside WorkflowJob after we make the snuba queries

src/sentry/workflow_engine/processors/workflow.py

ceorourke · 2025-01-17T21:21:11Z

src/sentry/workflow_engine/handlers/condition/event_frequency_handlers.py

-    def evaluate_value(value: list[int], comparison: Any) -> DataConditionResult:
-        if len(value) != 2:
+    def evaluate_value(value: WorkflowJob, comparison: Any) -> DataConditionResult:
+        if len(value.get("snuba_results", [])) != 2:


is this a common scenario or a weird snuba blip?

it's possible we don't have the snuba results when we are evaluating the triggers outside of delayed processing
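
The guard in this hunk can be illustrated with a reduced model. The `WorkflowJob` below is a simplified stand-in with a single key, and the meaning of the two results (a current count and a baseline count) is an assumption made for the example; the excerpt does not show what the real handler does with them.

```python
from typing import TypedDict


class WorkflowJob(TypedDict, total=False):
    """Simplified stand-in; total=False makes snuba_results optional."""
    snuba_results: list[int]


def evaluate_value(job: WorkflowJob, threshold: int) -> bool:
    """Return False when the query results are not available yet."""
    results = job.get("snuba_results", [])
    if len(results) != 2:
        # Outside delayed processing the queries have not run, so the key is
        # missing and the slow condition cannot pass yet.
        return False
    current, baseline = results  # assumed meaning, for illustration only
    return current - baseline > threshold
```

With this shape, the same function returns `False` during the fast pass and produces a real answer once delayed processing has populated `snuba_results`.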

ceorourke · 2025-01-17T21:25:07Z

src/sentry/workflow_engine/processors/workflow.py

+        buffer.backend.push_to_sorted_set(key=WORKFLOW_ENGINE_BUFFER_LIST_KEY, value=project_id)
+
+        if_dcgs = workflow_action_groups.get(workflow.id, [])
+        if not if_dcgs:


this reads a little strange - why is the var name if_dcgs? could it just be dcgs?

these are IF data condition groups

🤔 maybe we call them workflow_action_filters? (since that should be the type of DCG here)

saponifi3d

overall, lgtm. i think we can do a bit more cleanup here, but i don't think we need to block on that.

🙏 thanks for addressing the feedback!

saponifi3d · 2025-01-17T23:10:22Z

src/sentry/workflow_engine/processors/workflow.py

+        buffer.backend.push_to_sorted_set(key=WORKFLOW_ENGINE_BUFFER_LIST_KEY, value=project_id)
+
+        if_dcgs = workflow_action_groups.get(workflow.id, [])
+        if not if_dcgs:


🤔 maybe we call them workflow_action_filters? (since that should be the type of DCG here)

saponifi3d · 2025-01-17T23:29:22Z

src/sentry/workflow_engine/processors/workflow.py

+                # enqueue to be evaluated later
+                workflows_to_enqueue.add(workflow)
+
+    enqueue_workflows(workflows_to_enqueue, job)


nit: save the couple of cpu cycles and only enqueue if we have something to enqueue

Suggested change

-    enqueue_workflows(workflows_to_enqueue, job)
+    if workflows_to_enqueue:
+        enqueue_workflows(workflows_to_enqueue, job)
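
The effect of the suggested guard can be sketched with an in-memory stand-in for the buffer. `InMemoryBuffer`, `maybe_enqueue`, and the key's value are hypothetical; only the `push_to_sorted_set(key=..., value=...)` call shape is taken from the diff above.

```python
from collections import defaultdict

WORKFLOW_ENGINE_BUFFER_LIST_KEY = "workflow_engine_buffer"  # placeholder value


class InMemoryBuffer:
    """Hypothetical stand-in that mirrors only the one backend call shown in the diff."""

    def __init__(self) -> None:
        self.sorted_sets: dict[str, set[int]] = defaultdict(set)
        self.calls = 0

    def push_to_sorted_set(self, key: str, value: int) -> None:
        self.calls += 1
        self.sorted_sets[key].add(value)


def enqueue_workflows(buffer: InMemoryBuffer, workflow_ids: set[int], project_id: int) -> None:
    """Register the project for delayed processing once per enqueued workflow."""
    for _ in workflow_ids:
        buffer.push_to_sorted_set(key=WORKFLOW_ENGINE_BUFFER_LIST_KEY, value=project_id)


def maybe_enqueue(buffer: InMemoryBuffer, workflow_ids: set[int], project_id: int) -> bool:
    """Skip the call entirely when there is nothing to enqueue."""
    if not workflow_ids:
        return False
    enqueue_workflows(buffer, workflow_ids, project_id)
    return True
```

In this sketch the guard only saves a function call, which matches the "nit" framing of the review comment.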

saponifi3d · 2025-01-17T23:33:47Z

src/sentry/workflow_engine/types.py

@@ -40,6 +40,7 @@ class WorkflowJob(EventJob, total=False):
    has_alert: bool
    has_escalated: bool
    workflow: Workflow
+    snuba_results: list[int]


🤔 is the value in this that we can re-use the evaluate_workflows method?

i think this is a bit of a smell that the abstraction might not be quite right in either delayed processing or the evaluate_workflow_triggers 🤔

mind adding a TODO here so i can come back and take a look? i'm not sure if this is the best approach, but seems okay for now.

yeah, the value is that we can at least reuse the evaluate_workflow_triggers function. we'll already have processed the actions we can possibly fire before enqueuing, so all we need to do is process the slow conditions. but i'm also not sure if it's the best way to do it

enqueue workflows for delayed processing

c3ec931

github-actions bot added the "Scope: Backend" label (automatically applied to PRs that change backend components) Jan 15, 2025

vercel bot deployed to Preview January 15, 2025 22:43 View deployment

update it to work

9f8ea09

vercel bot deployed to Preview January 15, 2025 23:36 View deployment

actually enqueue the workflow

8630b00

cathteng commented Jan 16, 2025

src/sentry/workflow_engine/processors/workflow.py Outdated

vercel bot deployed to Preview January 16, 2025 00:14 View deployment

oopsie

30739b2

vercel bot deployed to Preview January 16, 2025 01:14 View deployment

smol ref

0ffd148

vercel bot deployed to Preview January 16, 2025 15:39 View deployment

cathteng commented Jan 16, 2025

src/sentry/workflow_engine/processors/workflow.py Outdated

cathteng marked this pull request as ready for review January 16, 2025 15:43

cathteng requested a review from a team as a code owner January 16, 2025 15:43

cathteng requested review from ceorourke and saponifi3d January 16, 2025 15:43

ceorourke reviewed Jan 16, 2025

tests/sentry/workflow_engine/processors/test_workflow.py Outdated

ceorourke approved these changes Jan 16, 2025

use different model to generate different hash

6a51006

cathteng requested review from a team as code owners January 16, 2025 22:38

cathteng added 2 commits January 16, 2025 14:39

Revert "use different model to generate different hash"

be626eb

This reverts commit 6a51006.

use different model to generate different hash

4b1708d

vercel bot deployed to Preview January 16, 2025 22:45 View deployment

saponifi3d requested changes Jan 17, 2025

refactor to enqueue workflows in evaluate_workflow_triggers and evaluate slow conditions when the data is available

44035ae

cathteng commented Jan 17, 2025

src/sentry/workflow_engine/processors/workflow.py Outdated

Update src/sentry/workflow_engine/processors/workflow.py

6bbdb9a

vercel bot deployed to Preview January 17, 2025 19:32 View deployment

cathteng requested a review from saponifi3d January 17, 2025 19:58

ceorourke reviewed Jan 17, 2025

saponifi3d approved these changes Jan 17, 2025

nits

84ee9ce

vercel bot deployed to Preview January 17, 2025 23:44 View deployment

cathteng merged commit 3caa22e into master Jan 21, 2025
49 checks passed

cathteng deleted the cathy/aci/enqueue-workflows branch January 21, 2025 18:02

andrewshie-sentry pushed a commit that referenced this pull request Jan 22, 2025

feat(aci): enqueue workflows for delayed processing (#83548)

fdca387
